Avalanche Internet Data Management System

Avalanche Internet Data Management System. Presentation plan 1. The problem to be solved 2. Description of the software needed 3. The solution 4. Avalanche


Page 1: Avalanche Internet Data Management System. Presentation plan 1. The problem to be solved 2. Description of the software needed 3. The solution 4. Avalanche

Avalanche

Internet Data Management System


Presentation plan

1. The problem to be solved
2. Description of the software needed
3. The solution
4. Avalanche features and advantages
5. Avalanche detailed description
6. Instruments and technologies used


Internet Surfers

There are two different groups of Internet users who have to perform the same task day after day.

These groups are:

Regular Internet users collecting information on their hobbies (basketball news, cooking recipes, pet information, etc.).

Analysts whose job is to gather and sort Internet data (e.g. for Gartner Group, Bloomberg or IDC).

The task is to gather and store information from the Web.


Step 1 to solve the task

1. The user needs to run a search or meta-search engine (e.g. Google, Yahoo, Copernic) and define the search query.

Keep in mind that different search engines have different syntactic rules for building a query, and they return very different results for the same query.

So, to make the search reasonably complete, the user has to repeat it several times with different search and meta-search engines, each with its own query syntax.


Steps 2, 3 to solve the task

2. The user needs to look carefully through every results page of every search engine and pick out only the sites that seem to contain the information he is looking for.

3. The user needs to validate each of the selected links to find out whether it is still alive.


Steps 4, 5 to solve the task

4. The user needs to visit each of the sites that passed validation and download its content to his local computer.

5. The user needs to check a few more links on each of these sites and download the content of the linked sites that interests him.


Steps 6, 7 to solve the task

6. After downloading all the data needed, the user has to take a few more steps offline. First, he has to examine all the downloaded files carefully and place each of them in the appropriate subfolder of the folder he uses to store files downloaded from the Internet.

7. Then, to find a stored file by keywords, the user can only use the standard Windows search, which has very limited capabilities (no hyperlinks, no cookies, etc.).


Conclusion

This is a fair description of the steps every user has to take each day to get and use the information he needs.

Using helpful tools and tricks (iHarvest software, Telnet software, the MyYahoo module, schedulers, etc.) does not change the situation substantially.


Special tool needed

The market currently lacks software designed to do the following:

Search the Web for information on a regular basis.

Try the links found and filter Internet content.

Collect the filtered data.

Classify the collected data.

Store the classified data and provide flexible and convenient access to it.


Why is there no software like this now?

Each of the existing software packages solves only a small part of the problem.

A software tool that solves the problem as a whole has to be considerably complex. It has to combine modules with substantially different functionality:

Surfing the Web and downloading Internet content.

Classifying the downloaded information.

Storing the data with convenient access to it.

Some of these modules involve only the usual programming complexity, but classification is a hard mathematical problem.


We did it!

We have developed a software system called Avalanche.

Avalanche is an Internet Data Management System (IDMS).

IDMS Avalanche contains a number of new-generation tools for:

knowledge mining;

knowledge storing;

knowledge representing.


Avalanche has a number of competitive advantages

Avalanche beats its main competitors in:

Extended syntactic data search

Automatic filtration of data found

Semantic data classification


Avalanche is a single product with a number of logically connected functions

Syntactic and semantic definition of necessary information.

Means of scheduled data search in WWW.

Semantic filtration and classification of incoming data.

Means of creating user’s personal encyclopedia.


Syntactic and semantic definition of necessary information

Avalanche includes the Internet Classifier, which provides tools for building the Semantic Catalogue. This Catalogue defines the structure of the necessary information.

The folder of the Semantic Catalogue in which a new document is placed is defined in terms of:

the presence or absence of certain words and phrases in the new document;

the computable proximity of the new document to a number of sample documents.
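A folder definition of this kind can be sketched in a few lines of Python. This is only an illustration, not the actual Avalanche algorithm: the `FolderRule` class, the bag-of-words cosine similarity used as the "computable proximity", and the 0.3 threshold are all assumptions made for the example.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


def tokenize(text):
    """Lower-case the text and split it into words."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class FolderRule:
    """Hypothetical rule describing one folder of the catalogue."""
    name: str
    required: list = field(default_factory=list)   # phrases that must be present
    forbidden: list = field(default_factory=list)  # phrases that must be absent
    samples: list = field(default_factory=list)    # sample documents of the folder
    threshold: float = 0.3                         # assumed proximity cut-off

    def matches(self, document):
        text = document.lower()
        # Plain substring checks: simple, but "cat" would also match "category".
        if any(p.lower() not in text for p in self.required):
            return False
        if any(p.lower() in text for p in self.forbidden):
            return False
        if not self.samples:
            return True
        doc_vec = Counter(tokenize(document))
        best = max(cosine(doc_vec, Counter(tokenize(s))) for s in self.samples)
        return best >= self.threshold
```

A document is accepted only if it passes both the word checks and the proximity check. A real classifier would probably weight words (for example by rarity) rather than use raw counts.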


Example of syntactic and semantic definition


Means of scheduled data search in World Wide Web

Avalanche includes the Internet Spider, which provides:

scheduled automatic search of the Web for requested information;

automatic link following;

automatic validation of the links found;

copying of the information found from the Internet to the user’s local computer.
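The link following, validation and copying steps can be illustrated with a small crawler built on the Python standard library. This is a sketch under stated assumptions, not the Internet Spider itself: the names `LinkExtractor`, `fetch` and `crawl`, the depth limit and the User-Agent string are invented for the example, and scheduling is left out (the `crawl` function would simply be called by whatever scheduler is available).

```python
import http.client
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.request import Request, urlopen


class LinkExtractor(HTMLParser):
    """Collects the href targets of <a> tags from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links and drop the #fragment part.
                absolute, _ = urldefrag(urljoin(self.base_url, href))
                self.links.append(absolute)


def fetch(url, timeout=10):
    """Download a page; return its text, or None if the link is dead."""
    request = Request(url, headers={"User-Agent": "sketch-spider"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError, http.client.HTTPException):
        return None


def crawl(start_urls, fetch_page=fetch, max_depth=1):
    """Breadth-first crawl; returns {url: page text} for live pages only."""
    pages = {}
    seen = set(start_urls)
    queue = deque((url, 0) for url in start_urls)
    while queue:
        url, depth = queue.popleft()
        text = fetch_page(url)
        if text is None:      # validation failed: the link is dead
            continue
        pages[url] = text     # keep a local copy of the content
        if depth < max_depth:
            parser = LinkExtractor(url)
            parser.feed(text)
            for link in parser.links:
                if link not in seen:
                    seen.add(link)
                    queue.append((link, depth + 1))
    return pages
```

Passing the download function as the `fetch_page` parameter keeps the crawl logic testable without network access. A production spider would also need politeness rules (robots.txt, rate limits), which this sketch omits.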


Example of scheduled data search


Semantic filtration and classification of incoming data

The Avalanche Internet Classifier provides:

Automatic classification of copied information in accordance with the Semantic Catalogue structure.

Storage of classified information. Information is stored on the local computer in an efficient way.

Re-classification of stored information. You can change your mind and reclassify information already received from the Internet.


Example of semantic filtration and classification


Means of creating user’s personal encyclopedia

Avalanche includes the Knowledge Database, which lets the user create and manage a personal encyclopedia. The encyclopedia is built as a local Internet site, so that the stored information is adequately described and convenient to maintain.
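The idea of an encyclopedia built as a local site can be illustrated by generating static HTML pages from a catalogue. The sketch below is an assumption about how such a site might look, not a description of the Knowledge Database: the `build_site` function, the `folder_N.html` naming and the shape of the `catalogue` argument are hypothetical.

```python
import html
from pathlib import Path


def build_site(catalogue, out_dir):
    """Write one HTML page per folder plus an index page linking to them.

    `catalogue` maps a folder name to a list of (title, local_path) pairs.
    Returns the path of the generated index page.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    index_items = []
    for number, (folder, documents) in enumerate(sorted(catalogue.items()), start=1):
        page_name = f"folder_{number}.html"
        rows = "\n".join(
            f'<li><a href="{html.escape(path, quote=True)}">{html.escape(title)}</a></li>'
            for title, path in documents
        )
        (out / page_name).write_text(
            f"<h1>{html.escape(folder)}</h1>\n<ul>\n{rows}\n</ul>\n",
            encoding="utf-8",
        )
        index_items.append(
            f'<li><a href="{page_name}">{html.escape(folder)}</a> ({len(documents)})</li>'
        )
    index = out / "index.html"
    index.write_text(
        "<h1>Personal encyclopedia</h1>\n<ul>\n" + "\n".join(index_items) + "\n</ul>\n",
        encoding="utf-8",
    )
    return index
```

Opening the returned `index.html` in a browser gives a list of folders with document counts, and each folder page links to the locally stored documents.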


Example of creating user’s personal encyclopedia


Avalanche is a well-structured product

Avalanche consists of:

Internet Spider, to find the necessary information;

Internet Classifier, for automatic semantic filtering of the data found;

Knowledge Database, a convenient mini-encyclopedia for working with the found and filtered information.


Avalanche is a flexible and scalable product

Avalanche is a good fit both for an expert’s analytical work and for an ordinary user’s Internet surfing.


Instruments and technologies

The Avalanche algorithms for data classification and text proximity evaluation are built on a strong mathematical basis.

Avalanche is developed with proven technology, which means following standards at every stage of project maintenance, programming and testing.


Different parts of Avalanche have been designed and developed using the most up-to-date and efficient tools and algorithms.

User interfaces have been developed using Borland RAD tools. The core code is written using an object-oriented approach, which makes Avalanche highly configurable and flexible.

The class design has been developed using Rational Rose tools, which are considered to be the best OOP design tools today.

The database is designed and normalized to third normal form (3NF), so data is stored efficiently and without redundancy. Data integrity is declared and enforced at the database level.
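As an illustration of what a 3NF design with integrity declared at the database level can look like, here is a minimal sketch using SQLite from Python. The `folder` and `document` tables and their columns are hypothetical and are not taken from the Avalanche schema; SQLite is used only because it ships with Python.

```python
import sqlite3

# Hypothetical schema: each folder name is stored once, and documents
# refer to folders by id instead of repeating the folder name.
SCHEMA = """
CREATE TABLE folder (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    parent_id INTEGER REFERENCES folder(id),
    UNIQUE (parent_id, name)
);

CREATE TABLE document (
    id        INTEGER PRIMARY KEY,
    url       TEXT NOT NULL UNIQUE,
    title     TEXT NOT NULL,
    folder_id INTEGER NOT NULL REFERENCES folder(id)
);
"""


def open_db(path=":memory:"):
    """Create the schema and return a connection with foreign keys enforced."""
    conn = sqlite3.connect(path)
    # SQLite checks foreign keys only when this pragma is on for the connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn
```

With this schema the database itself rejects a document that points to a missing folder or repeats an existing URL, so the application code does not have to re-check these rules.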

Dictionary and document search is optimized by using the latest hashing and caching algorithms combined with direct dictionary access.
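The combination of hashing and caching can be illustrated with a small inverted index. This is a generic sketch, not the Avalanche implementation: the `DocumentIndex` class is hypothetical, Python's built-in hash tables (`dict` and `set`) stand in for the hashing, and `functools.lru_cache` stands in for the caching.

```python
import re
from collections import defaultdict
from functools import lru_cache


class DocumentIndex:
    """Hypothetical keyword index over stored documents."""

    def __init__(self):
        # Hash table: word -> ids of the documents that contain it.
        self._postings = defaultdict(set)
        # Per-instance cache of query results, keyed by the tuple of words.
        self._cached_search = lru_cache(maxsize=256)(self._search)

    def add(self, doc_id, text):
        for word in re.findall(r"\w+", text.lower()):
            self._postings[word].add(doc_id)
        # Cached results may be stale once a new document arrives.
        self._cached_search.cache_clear()

    def _search(self, words):
        sets = [self._postings.get(w, set()) for w in words]
        return frozenset(set.intersection(*sets)) if sets else frozenset()

    def search(self, query):
        """Return the ids of documents containing every word of the query."""
        words = tuple(sorted(set(re.findall(r"\w+", query.lower()))))
        return self._cached_search(words)
```

Each word lookup is a direct hash-table access, and a repeated query is answered from the cache without touching the index. Clearing the whole cache on every `add` is the simplest correct policy; a finer-grained invalidation would be possible but is outside the scope of this sketch.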
