The Case for Open Source Search in the Enterprise

November 1, 2010 By David
Grazed from GigaOM. Author: Haydn Shaughnessy.

With enterprise data predicted to grow 650 percent between 2009 and 2014, the pressing question among IT departments is how to manage this data and make it easily searchable. While Google’s Search Appliance and offerings from search market leader Autonomy are options, these proprietary models often mean an increase in cost, either with the number of seats on a search software license or with the volume of searches in a given year, and some enterprises are seeking alternatives.

Otis Gospodnetic, co-author of the Lucene implementation primer “Lucene in Action,” says, “You would be surprised to know how many services and applications use Lucene. For example, did you know that Apple uses Lucene in Safari and iTunes?” He adds that Lucene powers Spotlight, the search functionality in OS X, and browser history in Safari. CNET, Netflix, Cisco, the Guardian, LinkedIn and MTV use it as well.

What’s driving wider adoption of open source? The open source community touts greater scalability, flexibility and speed than proprietary enterprise behemoths like Autonomy, Microsoft and Google offer. From a company’s perspective, Enterprise 2.0 and the socialization of knowledge create both unstructured data and a demand for knowledge as content.

Other factors include:

Enterprise 2.0. Enterprises face a growing array of data types. On top of relational databases, companies now generate data through wikis, legacy bulletin boards, internal online communities, email, customer interaction points, PDFs, video, images and Microsoft Office files. Estimates of these volumes vary, but a ratio of 80 to 20 unstructured-to-structured data in the enterprise seems to be sticking. An inability to organize this unstructured data and simultaneously access different structured data files is a problem for business intelligence, adaptability, responsiveness and compliance.

People and video. Cisco’s Pulse project, which uses Lucene/Solr as a platform, is a good example of an enterprise search application made possible by open source. It allows employees to find one another based on their current interests, even down to their latest corporate video appearance. Pulse tags content and media across enterprise networks and makes them searchable in real time. The benefit of Lucene/Solr as a platform for the project, according to Cisco, is its scalability and speed.

The cloud. Cloud computing and SaaS will present new challenges for data discovery and management as enterprises migrate data into cloud environments and expect immediate access to their assets through search.

There are barriers to the adoption of open source. Many enterprises have poor search and discovery platforms that don’t index and return data well, often because of poor index or topology management. Many have platforms with price plans that were designed when data access rights belonged to small teams, as opposed to the larger open source community.

However, the ability of customers to open up the search box and modify it for their own purposes, and to include advanced features like filtering at no cost, should see open source search emerge as a market force in the enterprise, the cloud, SaaS and e-commerce. Ultimately, the biggest driver may prove to be demand on the part of employees to make the enterprise environment look and work like the consumer-driven web.