/* Copyright (c) 2003 The Nutch Organization. All rights reserved. */
/* Use subject to the conditions in http://www.nutch.org/LICENSE.txt. */

package net.nutch.clustering;

import net.nutch.searcher.HitDetails;

/**
 * An extension point interface for online search results clustering
 * algorithms.
 *
 * <p>By <b>online</b> search results clustering we mean a clusterer that
 * works on the set of hits ({@link HitDetails} objects) retrieved for a
 * user's query and produces a set of {@link HitsCluster}s that can be
 * displayed to help the user gain insight into the topics found in the
 * result.</p>
 *
 * <p>Other clustering options include predefined categories and off-line
 * preclustered groups, but I do not investigate those any further here.</p>
 *
 * @author Dawid Weiss
 * @version $Id: OnlineClusterer.java,v 1.1 2004/08/09 23:23:52 johnnx Exp $
 */
public interface OnlineClusterer {
  /** The name of the extension point. */
  public static final String X_POINT_ID = OnlineClusterer.class.getName();

  /**
   * Clusters an array of hits ({@link HitDetails} objects) and
   * their previously extracted summaries (<code>String</code>s).
   *
   * <p>The arguments to this method may seem very low-level, but
   * they are by-products of a regular search, so we simply reuse
   * them instead of duplicating part of the usual Nutch
   * functionality. Other ideas are welcome.</p>
   *
   * <p>This method must be thread-safe: many threads may invoke
   * it concurrently on the same clusterer instance.</p>
   *
   * @param hitDetails the hits to cluster.
   * @param descriptions the previously extracted summaries of those hits.
   * @return An array of {@link HitsCluster} objects.
   */
  public HitsCluster[] clusterHits(HitDetails[] hitDetails, String[] descriptions);
}
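The thread-safety requirement on `clusterHits` is easiest to meet by keeping all working state local to the call. The following is a minimal, self-contained sketch of that idea, not an actual Nutch plugin. `DemoHitDetails`, `DemoHitsCluster`, `DemoOnlineClusterer`, and `FirstWordClusterer` are hypothetical stand-ins invented for illustration, since the real `HitDetails` and `HitsCluster` types are not shown here. The grouping rule (first word of the summary) is a deliberately naive placeholder for a real clustering algorithm, and pairing `descriptions[i]` with `hitDetails[i]` is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for net.nutch.searcher.HitDetails (assumption: only a URL is needed here).
final class DemoHitDetails {
    final String url;

    DemoHitDetails(String url) {
        this.url = url;
    }
}

// Hypothetical stand-in for HitsCluster: a label plus the hits assigned to it.
final class DemoHitsCluster {
    final String label;
    final List<DemoHitDetails> hits = new ArrayList<>();

    DemoHitsCluster(String label) {
        this.label = label;
    }
}

// Mirrors the shape of OnlineClusterer.clusterHits, using the stand-in types.
interface DemoOnlineClusterer {
    DemoHitsCluster[] clusterHits(DemoHitDetails[] hitDetails, String[] descriptions);
}

// Groups hits by the first word of their summary.
// The class has no mutable fields, so concurrent calls on one instance
// share nothing and cannot interfere with each other.
final class FirstWordClusterer implements DemoOnlineClusterer {
    @Override
    public DemoHitsCluster[] clusterHits(DemoHitDetails[] hitDetails, String[] descriptions) {
        // Assumption of this sketch: descriptions[i] is the summary of hitDetails[i].
        if (hitDetails.length != descriptions.length) {
            throw new IllegalArgumentException("hitDetails and descriptions must have equal length");
        }
        // Per-call map; insertion order keeps clusters in order of first appearance.
        Map<String, DemoHitsCluster> byLabel = new LinkedHashMap<>();
        for (int i = 0; i < hitDetails.length; i++) {
            String label = firstWord(descriptions[i]);
            byLabel.computeIfAbsent(label, DemoHitsCluster::new).hits.add(hitDetails[i]);
        }
        return byLabel.values().toArray(new DemoHitsCluster[0]);
    }

    // Returns the lower-cased first whitespace-delimited word, or "(other)" for blank text.
    private static String firstWord(String text) {
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        if (trimmed.isEmpty()) {
            return "(other)";
        }
        return trimmed.split("\\s+")[0];
    }
}
```

A caller would create one `FirstWordClusterer` and share it between request threads. Because each invocation builds its own map and result array, no synchronization is needed.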