Development of a Web/Email based Search Engine

 

Title of the project

 

Development of a Web/Email based Search Engine

 

Abstract of the project

 

This project aims to develop a Search Engine that is useful to an organisation or a college. This intranet-based stand-alone application can be accessed through a web browser or by e-mail. The application returns the search results for a given “keyword”.

 

Keywords

 

Generic Technology keywords

 

Databases, Networking, File Management, Programming

 

Specific Technology keywords

 

MS-SQL server, HTML, ASP, JSP, C, C++

 

Project type keywords

 

Analysis, Design, Implementation, Testing, User Interface

 

Functional components of the project

 

The following is a list of functionalities of the system. You may add other functionalities that you find appropriate. Where the description of a functionality is not adequate, make appropriate assumptions and proceed.

 

The Search Engine has both a Web and an E-mail interface. A user can fetch search results by typing the URL in a browser or by sending an e-mail to the site.
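A minimal sketch of the e-mail route, using only Python's standard `email` package. The brief does not say where the keyword sits in the incoming mail, so this sketch assumes it is the subject line. The names `keyword_from_mail`, `build_reply` and `MAX_RESULTS` are hypothetical, and the list of results is assumed to come from whatever search routine the team implements.

```python
from email import message_from_string
from email.message import EmailMessage

MAX_RESULTS = 20  # the brief asks for the top 20 results by mail


def keyword_from_mail(raw_mail: str) -> str:
    """Return the search keyword, assumed here to be the subject line."""
    msg = message_from_string(raw_mail)
    return (msg["Subject"] or "").strip()


def build_reply(raw_mail: str, results: list[str], site_address: str) -> EmailMessage:
    """Build a reply that lists at most MAX_RESULTS numbered results."""
    request = message_from_string(raw_mail)
    keyword = keyword_from_mail(raw_mail)

    reply = EmailMessage()
    reply["From"] = site_address
    reply["To"] = request["From"]  # answer whoever sent the request
    reply["Subject"] = f"Search results for: {keyword}"

    lines = [f"{i}. {url}" for i, url in enumerate(results[:MAX_RESULTS], start=1)]
    reply.set_content("\n".join(lines) if lines else "No documents matched.")
    return reply
```

The reply object could then be handed to the SMTP server listed in the software requirements, for example through `smtplib.SMTP.send_message`. Numbering the results also gives the user a simple way to ask for one of them in a follow-up mail.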

 

  1. User should be able to search for the “keyword” in all file formats.
  2. User should be able to search other sites for documents.
  3. User should be able to send a mail to the site, and the Search Engine should send the top 20 search results back to the user.
  4. User can send one of the search results to the site to fetch back the corresponding document.
  5. User should be able to search documents on FTP sites as well.
  6. Indexing of the search sites should be periodic.
  7. The search results should be displayed using paging logic.
  8. The search screen should allow the user to filter by criteria such as file type, file size, and other web sites.
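Items 7 and 8 can be illustrated with a short sketch of filtering and paging over an in-memory result list. The `Hit` record, its fields, and the default page size of 10 are assumptions made for illustration; the brief does not fix any of them.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    url: str
    file_type: str   # e.g. "pdf", "html"
    size_bytes: int


def filter_hits(hits, file_type=None, max_size=None):
    """Keep hits matching the optional file-type and size criteria."""
    return [
        h for h in hits
        if (file_type is None or h.file_type == file_type)
        and (max_size is None or h.size_bytes <= max_size)
    ]


def page_of(hits, page, page_size=10):
    """Return (hits on the 1-based page, total number of pages)."""
    total_pages = max(1, -(-len(hits) // page_size))  # ceiling division
    page = min(max(page, 1), total_pages)             # clamp out-of-range pages
    start = (page - 1) * page_size
    return hits[start:start + page_size], total_pages
```

Clamping the page number keeps a stale or hand-edited page parameter in the URL from producing an empty screen. In a database-backed version the same slicing would more likely be pushed into the query rather than done in memory.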

 

Steps to start-off the project

 

The following steps will be helpful to start off the project.

 

  1. Study and be comfortable with technologies such as Active Server Pages/HTML/JSP/Java and SQL server. Some links to these technologies are given in the ‘Guidelines and References’ section of this document.

  2. Search algorithms should be able to parse most of the commonly available file types, e.g. pdf, doc, xls, html, htm.

  3. The mail-to address and the subject line for the mails should be identified and need not be hardcoded. They can be picked up from configuration files or a database.

  4. Make a database for storing the indexed results of the sites, so that the same database can be used for caching the results.

  5. Create the help pages for the system in the form of Q&A. This will also help you when implementing the system.
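The parsing and index-database steps above can be sketched as follows. This is an illustration under stated assumptions, not the prescribed design: it uses Python's built-in `sqlite3` as a stand-in for MS-SQL server, handles only HTML (pdf, doc and xls would each need their own parser), and ranks by raw term count. The table name `postings` and all function names are hypothetical.

```python
import re
import sqlite3
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the visible text of an HTML page, skipping script/style."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def open_index(path=":memory:"):
    """Open (or create) the index; sqlite3 stands in for MS-SQL server."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS postings "
        "(term TEXT, url TEXT, hits INTEGER, PRIMARY KEY (term, url))"
    )
    return db


def index_page(db, url, html):
    """(Re)index one page: replace its old postings with fresh term counts."""
    counts = {}
    for term in re.findall(r"[a-z0-9]+", html_to_text(html).lower()):
        counts[term] = counts.get(term, 0) + 1
    with db:  # one transaction per page
        db.execute("DELETE FROM postings WHERE url = ?", (url,))
        db.executemany(
            "INSERT INTO postings VALUES (?, ?, ?)",
            [(term, url, n) for term, n in counts.items()],
        )


def search(db, keyword, limit=20):
    """Return up to `limit` URLs containing the keyword, most hits first."""
    rows = db.execute(
        "SELECT url FROM postings WHERE term = ? ORDER BY hits DESC, url LIMIT ?",
        (keyword.lower(), limit),
    )
    return [url for (url,) in rows]
```

Because `index_page` deletes a page's old postings before inserting new ones, calling it again on a schedule is one simple way to get the periodic re-indexing the functional list asks for, and the stored postings double as the cache mentioned in step 4.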

 

Requirements

 

Hardware requirements

 

| Number | Description | Alternatives (if available) |
|---|---|---|
| 1 | PC with 2 GB hard disk and 256 MB RAM | Not applicable |

 

 

 

 

Software requirements

 

| Number | Description | Alternatives (if available) |
|---|---|---|
| 1 | Windows 95/98/XP with MS Office | Not applicable |
| 2 | MS-SQL server | |
| 3 | SMTP server | Not applicable |

 

Manpower requirements

 

Two to three students can complete this in 4 to 6 months if they work full-time on it.

 

Milestones and Timelines

 

| Number | Milestone Name | Milestone Description | Timeline (week no. from the start of the project) | Remarks |
|---|---|---|---|---|
| 1 | Requirements Specification | Complete specification of the system (with appropriate assumptions) constitutes this milestone. A document detailing the same should be written and a presentation on it should be made. | 2 | An attempt should be made to add some more relevant functionalities other than those listed in this document. |
| 2 | Technology familiarization | Understanding of the technology needed to implement the project. | 6 | The presentation should be from the point of view of being able to apply the technology to the project, rather than from a theoretical perspective. |
| 3 | Database creation | A database of at least 100 entries of search pages should be created. | 8 | This database can also be used for caching purposes. |
| 4 | High-level and Detailed Design | Listing all possible scenarios (such as searching through the website, searching through e-mail, a search that does not return any pages, etc.) and then coming up with flow-charts or pseudocode to handle each scenario. | 11 | The scenarios should map to the requirement specification (i.e., for each requirement that is specified, there should be a corresponding scenario). |
| 5 | Implementation of the front-end of the system | Implementation of the main screen with the keyword search box, the search results display screen, etc. | 15 | During this milestone period, it would be a good idea for the team (or one person from the team) to start working on a test plan for the entire system. This test plan can be updated as and when new scenarios come to mind. |
| 6 | Integrating the front-end with the database and the application logic | The front-end developed in the earlier milestone will now be able to pass the keywords typed in the box to the application logic and get the search results, which will be updated in the database. | 16 | |
| 7 | Integration Testing | The system should be thoroughly tested by running all the test cases written for the system (from milestone 5). | 17 | Another 2 weeks should be allowed to handle any issues found during testing of the system. After that, the final demo can be arranged. |
| 8 | Final Review | Issues found during the previous milestone are fixed and the system is ready for the final review. | 19 | During the final review of the project, it should be checked that all the requirements specified during milestone number 1 are fulfilled (or that appropriate reasons are given for not fulfilling them). |

 

Guidelines and References

 

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnasp/html/asptutorial.asp (ASP tutorial)

 

http://www.functionx.com/sqlserver/ (SQL-server tutorial)

 

http://www.sourceforge.net


