Thursday, August 1, 2013

Content management hub to host big data

Content is getting bigger by the day, and smarter as well. As content grows in size and becomes more varied in structure, discovering valuable and relevant content becomes a challenge. Existing content management products are constrained by limited scalability, a narrow range of supported data formats, rigid schemas, and limited indexing and processing capability. Content enrichment is usually an external activity and is seldom deployed. As a result, the content manager acts more like a content repository, used primarily for search and retrieval of published content, with very limited support for content discovery and enrichment.

With the arrival of Big Content, the need to extract, enrich, organize and manage semi-structured and unstructured content and media is increasing. As the next generation of users will rely heavily on new modes of interacting with content, e.g., mobile devices and tablets, there is a need to re-look at traditional content management strategies. Artificial intelligence will now play a key role in information retrieval, information classification and usage for these sophisticated users. To apply artificial intelligence to this Big Content, knowledge about entities, domains, etc. must be captured in a form that a computer can process, reuse and interpret. This has led to ontologies, which formally specify and capture the structure of a domain; to taxonomies, which classify the entities within the domain into predefined categories; and to the inter-relating of these entities to create the semantic web.
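To make the distinction between a taxonomy and an ontology concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than part of any product or of the architecture in [6]: the category names, the keywords, the entity names, and the helper functions `classify` and `objects_of` are all hypothetical. The taxonomy is modelled as predefined categories with keywords, and the ontology as subject-predicate-object facts.

```python
# Hypothetical taxonomy: predefined categories, each with a few keywords.
TAXONOMY = {
    "Finance": {"invoice", "payment", "ledger"},
    "Healthcare": {"patient", "diagnosis", "clinic"},
}

# Hypothetical ontology fragment: (subject, predicate, object) facts
# that inter-relate entities within a domain.
TRIPLES = [
    ("Invoice", "is_a", "FinancialDocument"),
    ("Invoice", "issued_by", "Vendor"),
    ("Payment", "settles", "Invoice"),
]


def classify(text):
    """Return the taxonomy categories whose keywords occur in the text."""
    words = set(text.lower().split())
    return sorted(cat for cat, keywords in TAXONOMY.items() if words & keywords)


def objects_of(subject, predicate):
    """Return every object linked to the subject by the given predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


if __name__ == "__main__":
    print(classify("The patient received an invoice from the clinic"))
    print(objects_of("Invoice", "is_a"))
```

A real system would likely replace the keyword match with a trained classifier and store the facts in a dedicated graph or triple store; the sketch only shows how the two kinds of knowledge differ in shape.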

The following architecture, proposed by Narayanan and Sadhu [6], addresses the unique needs of a big data content management system. Many of the design elements shown below are becoming a reality in some of the larger companies.

[Figure: big data content management system]

References
  1. Agichtein, E., Brill, E. and Dumais, S. (2006), Improving web search ranking by incorporating user behavior, http://research.microsoft.com/en-us/um/people/sdumais
  2. Dumais, S. (2011), Temporal Dynamics and Information Retrieval, http://research.microsoft.com/en-us/um/people/sdumais
  3. Reamy, T. (2012), Taxonomy and Enterprise Content Management, http://www.kapsgroup.com/presentations.shtml
  4. Reamy, T. (2012), Enterprise Content Categorization – How to Successfully Choose, Develop and Implement a Semantic Strategy, http://www.kapsgroup.com/presentations/ContentCategorization-Development.pdf
  5. Barroca, E. (2012), Big data’s Big Challenges for Content Management, Tech News World, http://www.technewsworld.com/story/74243.html
  6. Narayanan, S. and Sadhu, A. (2013), Big Data Powered Extreme Content Hub, http://www.infosys.com/infosys-labs/publications/Documents/bigdata-challenges-opportunities.pdf