Managing Search Complexity through Simplicity

At the heart of any search solution is a good understanding of the business problem to be solved as well as knowledge of the available content and metadata. You have to work within the confines of the content you have (or can add with content enhancement). You have to analyze that content and be able to describe how you can identify some documents as being relevant for solving a business problem and why other documents are not relevant. That is just the beginning.

Developing an effective search solution is complex by its very nature. It brings many different pieces together to accomplish specific business objectives, and managing that complexity is one of the top challenges businesses face. The key to managing solution complexity and providing an effective solution is to simplify the scope of the solution by dividing the space into meaningful layers:

Business Problem to Solve
     The end goal is to provide a space where employees, customers and contractors can search for and find accurate and timely information. After all, their job is not to search but to complete some other business process or transaction. Search is a tool for helping people complete their tasks successfully. Understanding the business problem to be solved and how it relates to the larger strategic plan of the business is vital for creating a successful search space.

User Environment
    This layer covers how many people will be using the system, how they will access it (web, intranet, internet, mobile devices, embedded applications, etc.), and the business processes to be supported.

Search Application
    Consists of the Query and Results sub-layers. It is a user interface which consists of various search tooling functionalities. People’s impression of the effectiveness of a search solution is often based on the search interface alone. After all, that is what people see and use. Too often businesses use an out-of-the-box interface without evaluating whether it is designed to solve the business problems driving the need to update search.

Search Tooling
    Search tooling is the tool set that a search platform provides to build search applications. Tooling may include search word boosting, relevance tuning, thesaurus, synonyms, stopword lists, facets, taxonomies, search analytics, knowledge extraction tooling, and more. Also included are how content sources are crawled, parsed and indexed. Different search platforms provide different tool sets. Understanding what tooling is available and how it works is key to being able to architect an effective search solution.
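
    To make a few of these tools concrete, here is a minimal Python sketch of how stopword lists, synonyms and search word boosting might combine when a query comes in. Everything in it (the word lists, the weights, the expand_query function) is a made-up illustration, not the API of any platform named in this post. Real platforms configure these features through their own consoles and files.

```python
# Hypothetical tooling configuration, for illustration only.
STOPWORDS = {"the", "a", "an", "of", "for"}
SYNONYMS = {"laptop": ["notebook"], "tv": ["television"]}
BOOSTS = {"warranty": 2.0}  # terms the business wants weighted more heavily
SYNONYM_DISCOUNT = 0.8      # assumed: synonyms count a little less than typed terms


def expand_query(query: str) -> dict[str, float]:
    """Return a term -> weight map after stopword removal,
    synonym expansion and boosting."""
    weighted: dict[str, float] = {}
    for term in query.lower().split():
        if term in STOPWORDS:
            continue  # stopwords never reach the index lookup
        weight = BOOSTS.get(term, 1.0)
        weighted[term] = max(weighted.get(term, 0.0), weight)
        for synonym in SYNONYMS.get(term, []):
            discounted = weight * SYNONYM_DISCOUNT
            weighted[synonym] = max(weighted.get(synonym, 0.0), discounted)
    return weighted


if __name__ == "__main__":
    print(expand_query("warranty for the laptop"))
```

    The point of the sketch is that each tool is a small, separate decision. Which of them a platform exposes, and how, is what you need to find out before you architect the solution.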

Search Platform/Engine
    The software that provides the search tooling, whether it is IBM OmniFind Enterprise Edition, Endeca, Autonomy, FAST, Lucene/SOLR or some proprietary solution. In addition to search tooling, you also want to pay attention to a platform’s scalability, fail-over, disaster recovery, system management, configuration management, system security and availability.

Content Enhancement
    Content enhancement or enrichment is sometimes required in order to develop a search solution that will solve a specific business need. This may mean that third-party data needs to be added to existing data. It may mean that knowledge extraction tools need to be used for unstructured (that is, non-fielded) data such as that found in emails, reports and memo fields in databases.

Content & Metadata
    It is important to know the number of documents, content types (email, reports, database records, etc.), average size of documents, annual growth of your data stores, multi-lingual requirements, and governance strategy. It is also important to know the types of information that are explicitly available or can be extrapolated via content enhancement. This information will be the basis for building an effective search solution.
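
    The sizing numbers above feed a simple back-of-the-envelope projection. The sketch below is only that: it assumes growth compounds annually and measures raw content, not index size (index overhead varies by platform and is not modeled here). The function name and the sample figures are invented for illustration.

```python
def projected_corpus_gb(doc_count: int, avg_doc_kb: float,
                        annual_growth: float, years: int) -> float:
    """Estimate raw content volume in GB after `years` of compound growth.

    annual_growth is a fraction, e.g. 0.20 for 20% per year.
    """
    current_gb = doc_count * avg_doc_kb / (1024 * 1024)  # KB -> GB
    return current_gb * (1 + annual_growth) ** years


if __name__ == "__main__":
    # Sample figures only: 2 million documents averaging 150 KB, growing 20% a year.
    for year in range(4):
        size = projected_corpus_gb(2_000_000, 150, 0.20, year)
        print(f"year {year}: about {size:,.0f} GB")
```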

Security
    Security is a vital piece of any enterprise project. In the case of search, there is user authentication & authorization for accessing the search application and for the data that will be displayed in search results lists. (You don’t want just anybody to view HR data.) Search engine crawlers also require authentication and authorization for access to different data stores at the collection level. And then there is security and digital rights management at the individual document level within a given data store. In many cases this is the most complex IT piece of the search puzzle. However, solving this layer alone does not guarantee an effective search solution.
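
    As a rough picture of document-level security at query time, the Python sketch below trims a result list down to the documents a user's groups are allowed to see. It assumes the crawler captured each document's allowed groups at index time. The Document class, the group names and trim_results are hypothetical, and real platforms handle this in their own ways.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    title: str
    # Groups allowed to view the document, assumed captured at crawl time.
    allowed_groups: set[str] = field(default_factory=set)


def trim_results(results: list[Document], user_groups: set[str]) -> list[Document]:
    """Keep only documents that share at least one group with the user."""
    return [doc for doc in results if doc.allowed_groups & user_groups]


if __name__ == "__main__":
    hits = [
        Document("1", "Holiday schedule", {"employees"}),
        Document("2", "Salary bands", {"hr"}),
    ]
    for doc in trim_results(hits, {"employees"}):
        print(doc.doc_id, doc.title)
```

    A document with no allowed groups is hidden from everyone in this sketch. Failing closed like that is usually the safer default.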

Storage Environment
    Knowing the number and types of data stores (portal, file system, FileNet8, Domino, Quickr, Documentum, etc.) is vital. Knowing how frequently the stores are updated and will need to be searched is another key piece of information. It is also important to know the format that data is stored in (PDF, database, flat files, etc.).

IT Infrastructure
    All of the above occurs within an IT framework that has many layers in its own right. These can include network, hardware, operating system, system software and more, depending on the environment.

New OmniFind Enterprise Edition Coming

The word on the street is that the next version of IBM’s OmniFind Enterprise Edition (OEE) will use Lucene as its core. It will be interesting to see what new features this will bring to the OmniFind universe. I have all sorts of questions about what the next version can do.

- How will the use of Lucene affect performance?
- Will the new version include facets (finally)?
- How will categorization be handled?
- How will UIMA be incorporated with the new OEE?
- How will it handle crawl-time and query-time security needs?
- How will it handle plugins?
- Will the document size restriction be increased?
- Will it handle wild-card searching in a way that won’t increase the size of the index files X times over?
- What kind of user and system analytics will be available?
- Will you be able to do search word boosting?
- Will you be able to do relevance tuning?

Leave a comment and let me know what questions you have about the next version!

Search and WorkForce Integration Initiatives: Two Paths

Increasingly I am seeing more projects focusing on “Search and WorkForce Integration”. Search plays a key part in these initiatives. The goal of search then is to provide accurate and timely information. The gist of the business problem is this:
 
               If workers can’t find the information needed, 
               they either have to reinvent it or decide without it.
 
At some stage along the process of implementing a WorkForce Integration initiative, a company may find itself at a point where people are not using search because it “just doesn’t work right.”
 
At this point, companies respond by taking one of two paths. One path leads them out of the woods and the other gets them lost deeper in the forest.
Path 1.  The first path is a common IT response, and that is to throw more hardware and software at the problem. This response can be valid if search is slow or unable to handle indexing multiple file formats, meeting security needs, and similar issues. However, the problem is often not an IT problem but a business problem. If you treat search as an IT problem, then search will likely never work right. You will just get lost deeper in the forest.
 
Path 2.  The second path is to understand that people’s job is NOT to search for information. Their job is to complete various tasks such as analysis, evaluation, support, processing, etc. They search for information to find the latest information, identify reusable resources, solve problems, and provide answers. If you want to know why people aren’t using search, the answer is straightforward:
       “An information retrieval system will tend not to be used whenever it is
       more painful and troublesome for a customer to have information than
       for him not to have it.”  Calvin Mooers  (aka Mooers’ Law, 1959)
To solve the search problem, you must understand the business problem and how search can be used to meet those needs, and then make it easy for people to use the tool.
This is the area in which Davalen shines.  Our experience goes much further than the ability to install, configure and manage the OmniFind Enterprise Edition search application.  We know how to use the tooling to solve business problems.
 

Converting Online Shoppers to Buyers

At Davalen we recently conducted research that showed that 52% of IBM WebSphere Commerce Server websites are neglecting as many as half of their potential customers by not providing them with the search tools they need. You can read an excerpt and download the paper from here.

Making Your Customers Thirsty

Search Personalization for Retail Web Sites: Extend the Reach of Your Website Beyond One or Two Target Audiences (Webinar)

One of the biggest challenges a retail website has is how to provide a meaningful online  experience to multiple audiences.  To meet the challenge you need the ability to:

  • Organize your catalog to make the most sense to your users

  • Sequence products within your catalog for an appealing display for each customer.

  • Dynamically change catalog organization and product listing sequences based on customer type.
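
To give a feel for the third point, here is a minimal Python sketch that re-sequences the same product list for different customer types. The customer types, the sample catalog and the sort rules are all invented for illustration. A real site would drive them from its own catalog data and search platform.

```python
# Sample catalog, for illustration only.
CATALOG = [
    {"sku": "D100", "name": "Cordless Drill", "price": 129.0, "rating": 4.2},
    {"sku": "D200", "name": "Hammer Drill", "price": 249.0, "rating": 4.8},
    {"sku": "D050", "name": "Basic Drill", "price": 59.0, "rating": 3.9},
]

# Hypothetical rules: each customer type gets its own sort key.
SEQUENCE_RULES = {
    "contractor": lambda p: -p["rating"],    # best-rated tools first
    "bargain_hunter": lambda p: p["price"],  # cheapest first
}


def sequence_products(products: list[dict], customer_type: str) -> list[dict]:
    """Order products for a customer type, falling back to name order."""
    sort_key = SEQUENCE_RULES.get(customer_type, lambda p: p["name"])
    return sorted(products, key=sort_key)


if __name__ == "__main__":
    for customer_type in ("contractor", "bargain_hunter", "guest"):
        skus = [p["sku"] for p in sequence_products(CATALOG, customer_type)]
        print(customer_type, skus)
```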

You can view the webinar via this link:

http://www.davalen.com/info.php?about=Search_Personalization

Ten Mistakes to Avoid When Implementing Site Search for Internet Retail Sites

Setting up and using Site Search is a black box mystery for many people. That makes it very difficult to know if you are falling into a huge pit or missing a golden opportunity. To help judge where you might be, I’ve put together a list of mistakes and missed opportunities.

    10. Assuming that you will get the custom results that you want with an out-of-the-box solution.

    9. Not supporting search individualization/personalization.

    8. Not supporting marketing-based search tuning.

    7. Restricting the role of your search platform to return only search results.

    6. Expecting the same results with your new search platform as you got with your old platform

    5. Not tuning search based on site search analytics so the best products appear at the top of a result set. 

    4. Not checking the contents of a result set against your product catalog.

    3. Bad, messy or incomplete data.

    2. Not identifying and designing the data you need to drive the search results you want

    1. Not identifying and defining search scenarios at the start of your project.

 Remember!

    The bottom line is this…

                      “If they can’t find it, they won’t buy it!”
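
Mistake 4 is one of the easiest to check for. The Python sketch below compares the SKUs a search returned against the SKUs your catalog says should match a given search scenario, and reports what is missing and what should not be there. The function name and the sample SKUs are made up. How you pull results from your search platform and expected products from your catalog is up to your own environment.

```python
def audit_result_set(result_skus: list[str], expected_skus: list[str]) -> dict[str, list[str]]:
    """Compare search results with the catalog's expected products.

    Returns the SKUs the search missed and the SKUs it returned
    that the catalog does not list for this scenario.
    """
    results = set(result_skus)
    expected = set(expected_skus)
    return {
        "missing": sorted(expected - results),
        "unexpected": sorted(results - expected),
    }


if __name__ == "__main__":
    # Sample data: what a search for "cordless drill" returned versus
    # what the catalog lists under cordless drills.
    returned = ["D100", "D200", "S300"]
    in_catalog = ["D100", "D200", "D250"]
    print(audit_result_set(returned, in_catalog))
```

Run a check like this for each search scenario you defined at the start of the project and you will know where the gaps are before your customers do.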

Beyond Search: Expanding WebSphere Commerce-Driven Websites with OmniFind

Recognizing the potential of OmniFind in a WebSphere Commerce site requires a different mindset.   Here are some thoughts to get you started:

Think of OmniFind as an integration tool between the front end User Interface and the WebSphere Commerce database.  

Any time you want to display a set of products – think OmniFind

Any time you want to let users dynamically navigate a product set – think OmniFind

Any time you want to generate navigation dynamically based on information within a set of products – think OmniFind

Any time you want to create associations between products based on common attributes – think OmniFind

Any time you want to display a subset of products based on user actions, user credentials, user choices or other conditions – think OmniFind

Any time you want to bring together information from different sources beyond the WC catalog database – think OmniFind

If you want to associate data from different sources around a specific product, topic, or category – think OmniFind
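
To illustrate the idea of generating navigation dynamically from information within a set of products, here is a small Python sketch that counts attribute values across a product set to build facet-style navigation. It is a generic illustration and not OmniFind's API. The product fields and the build_facets function are assumptions made for the example.

```python
from collections import Counter


def build_facets(products: list[dict], fields: list[str]) -> dict[str, Counter]:
    """Count the values of each field across a product set.

    Each counter can drive a navigation block such as "Brand: Acme (2)".
    """
    return {
        name: Counter(p[name] for p in products if name in p)
        for name in fields
    }


if __name__ == "__main__":
    # Sample product set, e.g. the results of a search for "drill".
    product_set = [
        {"sku": "D100", "brand": "Acme", "color": "red"},
        {"sku": "D200", "brand": "Acme", "color": "blue"},
        {"sku": "D300", "brand": "Bolt", "color": "red"},
    ]
    for field_name, counts in build_facets(product_set, ["brand", "color"]).items():
        print(field_name, dict(counts))
```

Because the navigation is computed from whatever product set is on screen, it changes automatically as the user narrows or widens that set.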

ODE Chronicles: Quick Tip – Too Many Snapshots?

Ever run out of space on your servers because of too many snapshots? 

Try this feature to limit how many will be kept after builds.

ADMIN_MAX_DEPLOYMENT_DIRS

Link to IBM InfoCenter entry

At the Intersection of Technology, Metadata, and Goals

Discovery Architecture lives at the intersection of technology, metadata, user goals and business goals.   

 

In order to be successful, you have to take all four aspects into account when you build a search/discovery framework.

 

User goals and business goals blend together into sets of goals and form a business strategy. 

These sets of goals inform you about the direction you need to head. What they don’t necessarily tell you is how to get there. But have no fear! One of the great things about search technology is that it increases the number of options available to you.

 

The obvious side of this part of the equation is that you have to understand how user and business goals fit (or don’t fit) together.  A situation that I’ve run into where this can sometimes turn into a “gotcha” takes place when business analysts map out solutions (some very elegant) – without taking into account what current technology can support.  In these cases, they are building a Discovery Architecture without considering other parts of the equation.

 

Look to your technology to shape your solution.  

Technology will both enable and constrain what you will be able to implement – even if you have an unlimited budget for customizations. At the same time, your technology is also likely to open up opportunities that you may not have considered before.

 

So, I suggest to my clients, especially ones who have designed solutions without an eye towards the technology they are using, that they keep an open mind about how a given technology might accomplish the same goal by taking different steps within its own framework. This will help keep down the cost of unnecessary customizations. To help keep your options open, ask these questions:

 

-          What features does the technology offer that we haven’t requested that may be relevant to our goals?

-          Are there places where you can eliminate unnecessary customization by leveraging the technology’s supported feature set?

 

Metadata is the foundation for the framework you build.

Let’s be blunt.  To borrow an old saying, “You can’t make a silk purse out of a sow’s ear.”

 

 

No matter how sound the business strategy…

 

No matter how good your technology…

 

If you don’t have good metadata…

 

Your plan will fail!

Discovery Architecture – In A Nutshell

While on the plane tonight, I came up with a brief, elevator-speech description of Discovery Architecture.

Discovery Architecture
Using technology to interact with and manipulate metadata to enable users to meet their search/discovery goals in ways that also meet your business goals.