Big Data Indexing: Taxonomy, Performance Evaluation, Challenges and Research Opportunities

Abubakar Usman Othman(1), Timothy Moses(2), Umar Yahaya Aisha(3), Abdulsalam Ya’u Gital(4), Boukari Souley(5), Badmos Tajudeen Adeleke(6)

(1) Federal University Lafia
(2) Abubakar Tafawa Balewa University
(3) Gombe State University
(4) Abubakar Tafawa Balewa University
(5) Abubakar Tafawa Balewa University
(6) Industry and Innovation Institute, Sheffield Hallam University


Indexing methods are widely used on big data to retrieve information efficiently from very large, complex datasets held in distributed cloud storage. Big data has grown quickly with the spread of internet access, mobile devices such as smartphones and tablets, body-sensor devices, and cloud applications. This growth, seen in healthcare, manufacturing, the sciences, commerce, social networks, and agriculture, poses a variety of problems for big data indexing. Because of their high storage and processing requirements, current indexing approaches fall short of the needs of big data in cloud computing, so an effective indexing strategy is needed. This paper presents the state-of-the-art indexing techniques currently proposed for big data, identifies the problems facing these techniques and big data itself, and outlines future directions for research on big data indexing in cloud computing. It also compares the performance of the techniques in this taxonomy based on mean average precision and precision-recall rate.


Indexing; Similarity search; Matching; Big data; Cloud computing




