Enterprise Search White Papers and Presentations

Below are links to our mini white papers, addressing questions about enterprise search and more.

  • Enterprise Search Basics (pdf:157,797)
  • Enterprise Search and Government (pdf:112,083)
  • Enterprise Search for Law Firms (pdf:110,350)
  • Enterprise Search and E-Discovery (pdf:108,855)

Please contact us for additional information.

Migration from FAST ESP to Lucene Solr, by Michael McIntosh, VP, Enterprise Search Technologies, TNR Global

Video from the Lucene Revolution conference in Boston, MA.


Migration from FAST ESP to Lucene Solr (pdf:4,067,091)
Presented by Michael McIntosh, VP, Enterprise Search Technologies, TNR Global, at the Lucene Revolution Conference in Boston, MA.
There are many reasons that an IT department with a large-scale search installation would want to move from a proprietary platform to Lucene Solr. In the case of FAST Search, Microsoft’s purchase of the company and its discontinuation of the Linux platform have created urgency for FAST users.
This presentation compares Lucene/Solr to FAST ESP on a feature basis, and as applied to an enterprise search installation. We explore how various advanced features of commercial enterprise search platforms can be implemented as added functions for Lucene/Solr. Actual cases are presented describing how to map the various functions between systems.

Enterprise Search Glossary

  • API – Application Programming Interface. An interface that lets developers access functions of a hardware or software platform. An API allows an application to make requests of the operating system or of another application.
  • BI – Business Intelligence
  • BPM – Business Process Management. An area of BI.
  • CAD – Computer-Aided Design
  • CRM – Customer Relationship Management
  • CSR – Customer Support Representative
  • DBMS – Database Management System. Software that controls the organization, storage, retrieval, security, and integrity of data in a database.
  • EA – Enterprise Application. A software application hosted on a server that simultaneously provides services to a large number of users, typically over a computer network.
  • ECM – Enterprise Content Management. Capturing, managing, storing, and delivering content and documents pertaining to a company’s organizational processes.
  • EDM – Enterprise Document Management. See ECM.
  • EFQM – European Foundation for Quality Management
  • EP – Enterprise Portal. A secure web portal for use within an organization to allow employees and partners to search and access corporate information. Features may include personalization, search, collaboration, and content management.
  • ERP – Enterprise Resource Planning. Multi-module application software that helps manage diverse aspects of a business, which may include product planning, accounting, inventories, suppliers, customer service, and orders. Allows different departments to share information and communicate with each other.
  • ES – Enterprise Search. A search engine that indexes and searches content securely within a corporate intranet.
  • ESP – Enterprise Search Platform. A product of Fast Search & Transfer.
  • KM – Knowledge Management. How an enterprise gathers, organizes, shares, and analyzes its information. KM may include enterprise search, content management, business intelligence, and more.
  • LAMP – Linux, Apache, MySQL, PHP/Python/Perl. A stack of open source technologies.
  • OEM – Original Equipment Manufacturer
  • OLAP – Online Analytical Processing. Software tools that analyze data stored in a database.
  • PLM – Product Lifecycle Management
  • PM – Project Management
  • RDBMS – Relational Database Management System. A type of DBMS in which the database is organized and accessed according to the relationships between data values. The RDBMS was invented by a team led by Dr. Edgar F. Codd and funded by IBM in the early 1970s.
  • ROI – Return on Investment
  • SE – Search Engine
  • SEO – Search Engine Optimization (has little to do with ES)
  • SKU – Stock Keeping Unit. A specific number designating one specific product.
  • SKF – Strategic Knowledge Framework
  • TCO – Total Cost of Ownership
  • WCM – Web Content Management. An application used to create and manage web page content, including text, graphics and photos, video, or audio. WCM may also catalog or index content and provide the ability to personalize content for specific users.
  • XML – Extensible Markup Language. A W3C (World Wide Web Consortium) initiative that allows information and services to be encoded with meaningful structure and semantics that computers and humans can understand. XML is well suited to information exchange and can easily be extended to include user-specified and industry-specified tags.

How Enterprise Search Works

An enterprise search engine has two components: a front end and a back end. Both work together with the search index. The index is built statically for search speed, and is updated periodically. This is unlike a database where the indexes are updated in real time when data is changed or added.

[Figure: Enterprise Search System]

Back End:

Creating and updating the index.

  • Crawler – The crawler module reads and collects web pages and follows the links between them, starting with a list of initial URLs.
  • Document Processor – The document processor module processes web pages received from the crawler, as well as information received from databases through a ‘database connector’ and information from directories of files. The document processor takes the meaningful text from the documents, no matter the type or format, and adds whatever meaningful ‘meta-data’ it can determine, such as title or authors.
  • Indexer – The indexer module does the brute force work of creating and maintaining the index from the information it receives from the document processor.
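The back-end flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular search product works: the sample documents, the function names `process_document` and `build_index`, and the simple lowercase tokenizer are all invented for the example. The crawler and database connector are replaced by a hard-coded list.

```python
import re
from collections import defaultdict

# Hypothetical sample input standing in for what a crawler or
# database connector would hand to the document processor.
RAW_DOCUMENTS = [
    {"id": 1, "title": "Vacation Policy", "body": "Employees accrue vacation days monthly."},
    {"id": 2, "title": "Expense Policy", "body": "Submit expense reports monthly."},
]

def process_document(raw):
    """Document processor: pull out the text and simple metadata (the title)."""
    text = f"{raw['title']} {raw['body']}"
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {"id": raw["id"], "title": raw["title"], "tokens": tokens}

def build_index(processed_docs):
    """Indexer: map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc in processed_docs:
        for token in doc["tokens"]:
            index[token].add(doc["id"])
    return index

index = build_index(process_document(raw) for raw in RAW_DOCUMENTS)
print(sorted(index["monthly"]))   # ids of documents containing "monthly"
print(sorted(index["vacation"]))  # ids of documents containing "vacation"
```

The term-to-documents mapping built here is commonly called an inverted index. Building it ahead of time is what lets the front end answer queries quickly.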

Front End:

Responding to user queries.

  • Web server – The user’s web browser fetches a web page from the web server that contains a search form. The user then enters a query and the web browser sends the request to the web server.
  • Query processor – The web server sends a request with the user’s query to the query processor. The query processor properly formats the request and sends it to one (or more) search modules, collects the results and sends them to the web server for final formatting.
  • Search Engine – The search engine module receives the request from the query processor and does the actual searching inside the index that was created by the indexer.
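The front-end steps can be sketched the same way. The code below is a minimal example under assumptions: the small `INDEX` and `TITLES` tables are hypothetical stand-ins for the index produced by the back end, and the function names are invented. The web server is left out; `query_processor` plays the role of the module it would call.

```python
import re

# Hypothetical pre-built index (term -> document ids) and titles;
# in a real system the indexer produces these.
INDEX = {
    "vacation": {1},
    "policy": {1, 2},
    "expense": {2},
}
TITLES = {1: "Vacation Policy", 2: "Expense Policy"}

def search_engine(terms):
    """Search engine: return ids of documents that contain every term."""
    postings = [INDEX.get(term, set()) for term in terms]
    return set.intersection(*postings) if postings else set()

def query_processor(user_query):
    """Query processor: normalize the query, run the search, collect results."""
    terms = re.findall(r"[a-z0-9]+", user_query.lower())
    doc_ids = search_engine(terms)
    return [TITLES[doc_id] for doc_id in sorted(doc_ids)]

print(query_processor("Expense policy"))
print(query_processor("policy"))
```

Requiring every term to match (an AND query) is a design choice made for brevity. A real query processor would typically also rank the results and may send the request to more than one search module.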

To deliver an enterprise search solution that meets your organization’s needs, a number of components need to be incorporated:

Connectors

Allow search engines to gather information from various sources (structured databases, unstructured documents on internal and remote servers, desktop computers) in your enterprise as well as the external web. Specialized connectors are available for almost every type of file format and application, and custom connectors can be designed as needed.

Relevancy Tools

Build a customized ranking model that delivers content based on concepts, context, date, authority, completeness, geography, statistics, and quality. Tune each element to match your business needs.

Linguistics Tools

Identify synonyms (a search for “Great Britain” will include results for “England” as well), abbreviations, phrases, idioms, parts of speech, and misspellings. Lemmatization matches regular and irregular grammatical forms. Type in ‘goes,’ and you can also find ‘went.’ Prefixes and suffixes can be disregarded, if needed. Certain words can be skipped. The system knows the difference between the ‘wind that blows’ and ‘wind your watch.’ Phonetic search allows you to find results based on phonetic similarities, which is especially useful for names. The system can recognize queries containing who, what, where, when, or why, and provide appropriate results.
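Two of these ideas, synonym expansion and lemmatization, can be illustrated with simple lookup tables. This is a toy sketch: the `SYNONYMS` and `LEMMAS` tables and both function names are hypothetical, and real linguistics tools rely on large dictionaries and language models rather than hand-written entries.

```python
# Hypothetical hand-written tables for illustration only.
SYNONYMS = {"great britain": ["england"]}
LEMMAS = {"goes": "go", "went": "go", "gone": "go"}

def lemmatize(word):
    """Map a word to its base form so regular and irregular forms match."""
    word = word.lower()
    return LEMMAS.get(word, word)

def expand_with_synonyms(phrase):
    """Return the phrase plus any synonyms that should also be searched."""
    key = phrase.lower()
    return [key] + SYNONYMS.get(key, [])

print(lemmatize("goes") == lemmatize("went"))
print(expand_with_synonyms("Great Britain"))
```

If the same `lemmatize` step is applied both when indexing and when querying, a search for ‘goes’ and a document containing ‘went’ meet at the shared base form.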

Sentiment analysis

Determine if a document has a negative or positive tone. Use this tool to monitor user groups, to analyze reviews of your products and services, or to follow press coverage of your business.
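As a rough illustration of the idea, tone can be estimated by counting positive and negative words. The word lists and the `sentiment` function below are assumptions made for this sketch. Production sentiment analysis generally uses trained models and handles negation, sarcasm, and context, which this example does not.

```python
import re

# Hypothetical, deliberately tiny word lists.
POSITIVE = {"great", "excellent", "love", "reliable"}
NEGATIVE = {"poor", "broken", "hate", "slow"}

def sentiment(text):
    """Classify text as positive, negative, or neutral by counting words."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("Great product, I love it"))
print(sentiment("Slow and broken"))
```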

Business rules

Each organization has specific guidelines for how it does business, and these guidelines can be implemented in an enterprise search solution. If you have, for example, two levels of advertisers who pay for placement, the higher level advertiser can be assigned to appear in a separate format on top of the results list. When customers have worked with you before, use business rules to determine which search results are most relevant to their specific needs.
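The advertiser example can be expressed as a small rule applied after the search engine has ranked its results. The result records, the `tier` field, and the function name are hypothetical and chosen only to show the shape of such a rule.

```python
# Hypothetical results: tier 2 is the higher advertiser level,
# tier 1 the lower level, tier 0 an ordinary result.
results = [
    {"title": "Acme Hiking Boots", "relevance": 0.91, "tier": 0},
    {"title": "Summit Boots (sponsor)", "relevance": 0.62, "tier": 2},
    {"title": "Trailhead Boots (sponsor)", "relevance": 0.75, "tier": 1},
]

def apply_placement_rule(results):
    """Show higher-level advertisers in a separate block above the ranked list."""
    featured = [r for r in results if r["tier"] == 2]
    rest = [r for r in results if r["tier"] != 2]
    rest.sort(key=lambda r: r["relevance"], reverse=True)
    return {"featured": featured, "results": rest}

page = apply_placement_rule(results)
print([r["title"] for r in page["featured"]])
print([r["title"] for r in page["results"]])
```

Keeping the rule separate from relevance scoring means the business can change placement policy without retuning the ranking model.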

Performance, Scalability and High Availability

Enterprise search needs to be fast, and it needs to be reliable. TNR Global designs your hardware specifications to meet speed requirements and to offer a failover option in case of power outages or server issues. Search engine architecture is scalable, and you can increase index capacity, indexing rate, and/or query processing speed as needed.

Benchmarking

You can’t manage what you don’t measure. One of the most valuable tools in enterprise search is the ability to learn from the system and measure results. With benchmarking, you track how your users respond to the search engine, and how the search engine responds to them. Metrics include click counting, hardware performance, and quantity of data searched. You can track which queries return no results and then tweak synonyms or taxonomy to provide answers to these queries.
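Tracking queries that return no results needs little more than a query log. The sketch below assumes a hypothetical log of (query, result count) pairs. The log format and the function name are illustrative, not taken from any specific product.

```python
from collections import Counter

# Hypothetical query log: (query text, number of results returned).
query_log = [
    ("vacation policy", 12),
    ("pto", 0),
    ("expense form", 4),
    ("pto", 0),
    ("401k match", 0),
]

def zero_result_queries(log):
    """Return queries that found nothing, most frequent first."""
    counts = Counter(query for query, hits in log if hits == 0)
    return counts.most_common()

print(zero_result_queries(query_log))
```

In this sample, "pto" fails twice, which suggests adding it as a synonym for "vacation" or "paid time off".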

Security

Security is a vital component for any enterprise application. With enterprise search, security guidelines apply both to the documents and to the user. Each set of documents can have its own security settings. You can restrict documents from the search engine, or allow the search engine to index the documents, but restrict access to certain groups of users. Security checks are performed when a set of results is queried and as the indexer crawls the data source.
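The query-time check can be sketched as a filter over the result list. The access control table, document names, and group names below are hypothetical. Real deployments usually read permissions from the source systems or a directory service.

```python
# Hypothetical access control list: document id -> groups allowed to see it.
DOCUMENT_ACL = {
    "salary-bands.xlsx": {"hr"},
    "holiday-calendar.pdf": {"hr", "engineering", "sales"},
    "roadmap.docx": {"engineering"},
}

def filter_results(result_ids, user_groups):
    """Keep only results that at least one of the user's groups may see."""
    return [
        doc_id
        for doc_id in result_ids
        # Documents with no ACL entry are hidden by default.
        if DOCUMENT_ACL.get(doc_id, set()) & user_groups
    ]

hits = ["salary-bands.xlsx", "holiday-calendar.pdf", "roadmap.docx"]
print(filter_results(hits, {"sales"}))
print(filter_results(hits, {"engineering"}))
```

Denying access when a document has no entry is a deliberate choice in this sketch, so that a missing permission record never exposes a document.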

Web Search vs. Enterprise Search

Most people associate search with Google or Yahoo. But enterprise search is a different type of solution.

Web search

Word association: when someone says ‘search,’ is your next thought ‘Google’? How Google works: people who create public web sites design them to be found. Google indexes mostly homogeneous HTML pages that contain metadata and special tags. Moreover, the strength of Google’s search engine is based, in large part, on an algorithm that tracks how sites are linked to other sites. In a simplified case, the more incoming links a page or site has, the higher its ranking in search results.
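The simplified case can be shown by counting incoming links in a small link graph. This is not Google’s actual algorithm, which also weighs the importance of each linking page. The link graph and function name are invented for the illustration.

```python
from collections import Counter

# Hypothetical link graph: page -> pages it links to.
LINKS = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "d.example": ["c.example", "b.example"],
}

def rank_by_incoming_links(links):
    """Order pages by how many other pages link to them."""
    incoming = Counter(target for targets in links.values() for target in targets)
    return [page for page, _ in incoming.most_common()]

print(rank_by_incoming_links(LINKS))
```

Pages with no incoming links, such as `a.example` and `d.example` here, do not appear in the ranking at all. Enterprise documents that are rarely linked to one another are in much the same position.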

Enterprise Search

Enterprise information often resides in non-interconnected and even legacy systems, in a wide variety of file formats. Data is located in internal directories, emails, manuals, technical specifications, CMS (content management software), CRM (customer relationship management) software, and the list goes on (and on). Compare enterprise data to the HTML pages that Google searches. Where is the network of links? Where are the metadata tags that help Google find pages? In most cases, they don’t exist.

Paper about Enterprise Search Basics (pdf:157,797)


Enterprise Search Benefits

When companies begin looking for a search solution, it is easy to become overwhelmed by the variety of software and vendor choices available, and the complexity of the technology. The benefits of enterprise search, however, are clear.

Improved decision making

Don’t make decisions in the dark. Find the answers you need from sales records, internal memos, research, financial reports, relevant industry web sites and more.

Lower call center costs

With enterprise search implemented on your external web site, customers are able to find answers to technical or sales questions easily, so they don’t need to call. And when they do call, you spend less because your representatives are able to find information quickly.

Higher productivity

An enterprise search solution can reduce the time your employees spend looking for information by 15 to 30 percent*, and reduce time employees spend recreating documents that are ‘somewhere out there in the system’ but which are too difficult to locate. (*IDC: ‘The High Cost of Not Finding Information,’ Susan Feldman and Chris Sherman, April 2003, IDC #29127)

Better communication

With the rise of telecommuting and outsourcing, you may have workers in different time zones and on different continents. When your London office needs an answer, and it’s 3:00 am in NY, calling the manager in NY may not be the best option. Wouldn’t it be better if that answer were available through internal search?

Regulatory Compliance

Reduce the cost and the time invested in complying with government or industry regulations.

Web based revenue

Advertiser driven? Subscription based? Retail transactions? Whether you profit by selling information, ads or bungee cords, enterprise search can help. With enterprise search, you can easily offer different levels of access for subscribers. Increase advertiser revenue by coordinating banner ads with customer data. Selling shoes? Have the search engine suggest a pair of matching socks. No matter what the business, customers want to find things quickly. As many as half of prospective customers will leave after two to three clicks if they can’t find what they need. Implement enterprise search, make search easy, and turn prospects into sales.

Proven ROI

A landmark IDC study showed that “an enterprise with 1,000 knowledge workers wastes $48,000 per week ($2.5 million per year) due to an inability to locate and retrieve information.”* $2.5 million is a lot to pay for lousy search. Even though this study is nearly a decade old, as content grows at a staggering pace, the message only gains value. Organizations will need a powerful search engine tuned to their business to unlock the information buried inside their own enterprise. (*IDC: ‘The High Cost of Not Finding Information,’ Susan Feldman and Chris Sherman, April 2003, IDC #29127)

Enterprise Search Applications

Enterprise search has applications in a variety of areas and industries. Within a single enterprise, there may be many departments that can benefit from a customized enterprise search solution.

  • Corporate Operations – intranet, site search
  • eCommerce – online sales
  • Market Management – analyze customer sentiment, product reviews, track the competition
  • Online Media and Publishing
  • OEM Search Integration
  • Risk Management
  • Surveillance and Enforcement

Start by contacting us. We will work with you to prepare a full analysis detailing how an improved enterprise search can benefit your organization. Our mission goes beyond a single implementation. We help companies grow, and we value long term business relationships.

Open Source Search Solutions

TNR Global provides enterprise search implementation services throughout the entire implementation cycle.
We help evaluate different vendor options, audit existing solutions, implement new solutions, upgrade existing solutions, and provide ongoing support for implemented solutions.

We specialize in Lucene Solr development and implementations. We also have experience with other open source search systems: ElasticSearch for Big Data, SearchBlox, Sphinx, Hadoop, HBase, Lemur/Indri, Nutch, SWISH-E, and OpenFTS. Contact us for a free consultation.
