About the project
Project Infocrystal is conceived as a knowledge base: a natural-language model of the world.
In the modern world, the volume of information available to people is growing like an avalanche. Unfortunately, much of this volume is froth or outright disinformation, and it is becoming ever harder for a person to find reliable information. Search engine results require in-depth analysis and contain many duplicate links. The accuracy of the information they return is subjective. The articles of the wikipedia.org encyclopedia have great educational value, but they are very static, even though volunteers from around the world constantly update and edit them. Articles about material objects contain almost no physico-chemical data, at best statistical and geographical data, which makes it difficult to model the object.
The Infocrystal project is expected to produce a knowledge base with a claim to artificial intelligence. The knowledge base consists of a database of objects, which supports hierarchy and inheritance of properties, and policies for the objects. A trained situational neural network and the methods of fuzzy set theory are expected to be used in building the knowledge base.
In response to a natural-language query, the knowledge base will produce a natural-language answer of encyclopedic quality, with a breadth of coverage that matches the request. It should also be able to generate up-to-date analytical reviews on a variety of topics and to solve technical problems.
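The idea of objects that support hierarchy and inheritance of properties can be illustrated with a minimal sketch. It is not the project's actual schema: the class `KbObject`, its methods, and the sample objects are hypothetical names chosen for illustration, and properties are simplified to strings.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration: an object with a parent and its own properties.
public class KbObject {
    private final String name;
    private final KbObject parent; // null for a root object
    private final Map<String, String> properties = new HashMap<>();

    public KbObject(String name, KbObject parent) {
        this.name = name;
        this.parent = parent;
    }

    // Sets a property on this object only; returns this for chaining.
    public KbObject set(String key, String value) {
        properties.put(key, value);
        return this;
    }

    // Looks the property up on this object, then walks up the parent chain,
    // so a child inherits every property it does not override.
    public Optional<String> get(String key) {
        for (KbObject o = this; o != null; o = o.parent) {
            String value = o.properties.get(key);
            if (value != null) {
                return Optional.of(value);
            }
        }
        return Optional.empty();
    }

    public String getName() {
        return name;
    }

    public static void main(String[] args) {
        KbObject metal = new KbObject("metal", null).set("conductsElectricity", "yes");
        KbObject copper = new KbObject("copper", metal).set("density", "8.96 g/cm3");
        System.out.println(copper.get("conductsElectricity").orElse("unknown")); // inherited
        System.out.println(copper.get("density").orElse("unknown"));             // own property
        System.out.println(copper.get("color").orElse("unknown"));               // not defined
    }
}
```

In this sketch a lookup costs one map access per level of the hierarchy. A real object database would presumably resolve inheritance in storage rather than in memory.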
The Infocrystal project involves several self-contained intermediate stages:
- The preliminary stage
At this stage, a three-tier client-server interaction scheme is implemented. This stage is now complete and the application server is implemented. In addition to serving client applications and authenticating users, the application server runs all periodic tasks. At this stage, tasks written in Java and PL/SQL are implemented. Support for tasks in natural language is planned for the future.
- Library
The strategy for building the knowledge base is to use reliable information for its initial generation. Printed editions have always been such a source, so the Library plays an important role in the project. Currently the project's file storage can accommodate more than 20 million files totaling 28 TB. The storage is replenished through the download queue and the periodic tasks of the application server. Duplicates, near-duplicates, and inappropriate content are removed when files are checked in to the storage. An important area at this stage is OCR of book scans, as well as speech synthesis and speech recognition. The main task is to classify the books and convert them to a text format. FB2 is adopted as the main format of the Library; other formats are converted where possible. The project does not plan to offer books for download, so as not to infringe copyright. The project is currently at this stage.
- Encyclopedia
Once the architecture of the object database has been developed, work begins on compiling a dictionary from the natural-language information in the Library, identifying objects and properties, and constructing scenes of object interaction. All objects and properties are recorded in the object database, which is then used to train the neural network to recognize new objects and words automatically. As the object database takes shape, it may be published externally in the form of an encyclopedia.
- Knowledge base
As the compiled natural-language information is processed and scenes are built, the policies are populated from reliable information and the neural networks are trained. Once enough knowledge has accumulated, it becomes possible to take in unreliable information as well. It is important to develop algorithms or neural networks that recognize scenes and detect inconsistencies, inaccuracies, and lies. This stage will require extensive research.
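The Library stage says that copies are removed when a file is checked in to the storage, without saying how. One common way to catch exact copies is to compare content digests. The sketch below assumes SHA-256 and an in-memory set of digests. The class `CheckIn` and its methods are hypothetical, and it does not cover near-duplicates or inappropriate content.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of exact-duplicate rejection at check-in.
public class CheckIn {
    // Digests of every file accepted so far (a real store would persist these).
    private final Set<String> seenDigests = new HashSet<>();

    // Returns the SHA-256 digest of the content as lower-case hex.
    static String sha256Hex(byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    // Accepts the file and returns true, or returns false for an exact copy.
    public boolean checkIn(byte[] content) {
        return seenDigests.add(sha256Hex(content));
    }

    public static void main(String[] args) {
        CheckIn storage = new CheckIn();
        byte[] book = "chapter one".getBytes(StandardCharsets.UTF_8);
        byte[] copy = "chapter one".getBytes(StandardCharsets.UTF_8);
        byte[] other = "chapter two".getBytes(StandardCharsets.UTF_8);
        System.out.println(storage.checkIn(book));  // first copy is accepted
        System.out.println(storage.checkIn(copy));  // identical bytes are rejected
        System.out.println(storage.checkIn(other)); // different content is accepted
    }
}
```

A digest only matches byte-identical files. Two scans of the same book would pass this check and would need a separate similarity measure.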
Project dependencies
- Encog neural network library
<dependency>
<groupId>org.encog</groupId>
<artifactId>encog-core</artifactId>
<version>3.3.0</version>
</dependency>
- iText PDF library
<dependency>
<groupId>com.itextpdf</groupId>
<artifactId>itextpdf</artifactId>
<version>5.5.7</version>
</dependency>
<dependency>
<groupId>com.itextpdf</groupId>
<artifactId>itext-pdfa</artifactId>
<version>5.5.7</version>
</dependency>
<dependency>
<groupId>com.itextpdf</groupId>
<artifactId>itext-xtra</artifactId>
<version>5.5.7</version>
</dependency>
- Apache POI library for working with MS Word and Excel
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi</artifactId>
<version>3.13</version>
</dependency>
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi-ooxml</artifactId>
<version>3.13</version>
</dependency>
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi-scratchpad</artifactId>
<version>3.13</version>
</dependency>
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi-ooxml-schemas</artifactId>
</dependency>
- TwelveMonkeys extension of the supported image formats
<dependency>
<groupId>com.twelvemonkeys.imageio</groupId>
<artifactId>imageio-core</artifactId>
<version>3.3.2</version>
</dependency>
<dependency>
<groupId>com.twelvemonkeys.imageio</groupId>
<artifactId>imageio-jpeg</artifactId>
<version>3.3.2</version>
</dependency>
<dependency>
<groupId>com.twelvemonkeys.imageio</groupId>
<artifactId>imageio-tiff</artifactId>
<version>3.3.2</version>
</dependency>
<dependency>
<groupId>com.twelvemonkeys.imageio</groupId>
<artifactId>imageio-psd</artifactId>
<version>3.3.2</version>
</dependency>
<dependency>
<groupId>com.twelvemonkeys.imageio</groupId>
<artifactId>imageio-metadata</artifactId>
<version>3.3.2</version>
</dependency>
<dependency>
<groupId>com.twelvemonkeys.imageio</groupId>
<artifactId>imageio-bmp</artifactId>
<version>3.3.2</version>
</dependency>
<dependency>
<groupId>com.twelvemonkeys.imageio</groupId>
<artifactId>imageio-pnm</artifactId>
<version>3.3.2</version>
</dependency>
<dependency>
<groupId>com.twelvemonkeys.imageio</groupId>
<artifactId>imageio-icns</artifactId>
<version>3.3.2</version>
</dependency>
<dependency>
<groupId>com.twelvemonkeys.imageio</groupId>
<artifactId>imageio-pict</artifactId>
<version>3.3.2</version>
</dependency>
<dependency>
<groupId>com.twelvemonkeys.imageio</groupId>
<artifactId>imageio-tga</artifactId>
<version>3.3.2</version>
</dependency>
<dependency>
<groupId>com.twelvemonkeys.imageio</groupId>
<artifactId>imageio-sgi</artifactId>
<version>3.3.2</version>
</dependency>
<dependency>
<groupId>com.twelvemonkeys.imageio</groupId>
<artifactId>imageio-pcx</artifactId>
<version>3.3.2</version>
</dependency>
<dependency>
<groupId>com.twelvemonkeys.imageio</groupId>
<artifactId>imageio-pdf</artifactId>
<version>3.3.2</version>
</dependency>
<dependency>
<groupId>com.twelvemonkeys.imageio</groupId>
<artifactId>imageio-iff</artifactId>
<version>3.3.2</version>
</dependency>
<dependency>
<groupId>com.twelvemonkeys.common</groupId>
<artifactId>common-image</artifactId>
<version>3.3.2</version>
</dependency>
<dependency>
<groupId>com.twelvemonkeys.common</groupId>
<artifactId>common-io</artifactId>
<version>3.3.2</version>
</dependency>
<dependency>
<groupId>com.twelvemonkeys.common</groupId>
<artifactId>common-lang</artifactId>
<version>3.3.2</version>
</dependency>
<dependency>
<groupId>com.twelvemonkeys.imageio</groupId>
<artifactId>imageio-thumbsdb</artifactId>
<version>3.3.2</version>
</dependency>
<dependency>
<groupId>com.twelvemonkeys.imageio</groupId>
<artifactId>imageio-batik</artifactId>
<version>3.3.2</version>
</dependency>
- Apache Batik library for the SVG vector format
<dependency>
<groupId>org.apache.xmlgraphics</groupId>
<artifactId>batik-bridge</artifactId>
<version>1.7</version>
</dependency>
<dependency>
<groupId>org.apache.xmlgraphics</groupId>
<artifactId>batik-dom</artifactId>
<version>1.7</version>
</dependency>
<dependency>
<groupId>org.apache.xmlgraphics</groupId>
<artifactId>batik-css</artifactId>
<version>1.7</version>
</dependency>
<dependency>
<groupId>org.apache.xmlgraphics</groupId>
<artifactId>batik-ext</artifactId>
<version>1.7</version>
</dependency>
<dependency>
<groupId>org.apache.xmlgraphics</groupId>
<artifactId>batik-gui-util</artifactId>
<version>1.7</version>
</dependency>
<dependency>
<groupId>org.apache.xmlgraphics</groupId>
<artifactId>batik-parser</artifactId>
<version>1.7</version>
</dependency>
<dependency>
<groupId>org.apache.xmlgraphics</groupId>
<artifactId>batik-util</artifactId>
<version>1.7</version>
</dependency>
<dependency>
<groupId>org.apache.xmlgraphics</groupId>
<artifactId>batik-awt-util</artifactId>
<version>1.7</version>
</dependency>
<dependency>
<groupId>org.apache.xmlgraphics</groupId>
<artifactId>batik-gvt</artifactId>
<version>1.7</version>
</dependency>
<dependency>
<groupId>org.apache.xmlgraphics</groupId>
<artifactId>batik-transcoder</artifactId>
<version>1.7</version>
</dependency>
<dependency>
<groupId>org.apache.xmlgraphics</groupId>
<artifactId>batik-script</artifactId>
<version>1.7</version>
</dependency>
<dependency>
<groupId>org.apache.xmlgraphics</groupId>
<artifactId>batik-svg-dom</artifactId>
<version>1.7</version>
</dependency>
<dependency>
<groupId>org.apache.xmlgraphics</groupId>
<artifactId>batik-xml</artifactId>
<version>1.7</version>
</dependency>
<dependency>
<groupId>org.apache.xmlgraphics</groupId>
<artifactId>batik-svggen</artifactId>
<version>1.7</version>
</dependency>
- H2 file database
<dependency>
<groupId>com.h2database</groupId>
<artifactId>h2</artifactId>
<version>1.4.191</version>
</dependency>
- Lucene search engine library
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-core</artifactId>
<version>5.5.0</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-analyzers-common</artifactId>
<version>5.5.0</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-queryparser</artifactId>
<version>5.5.0</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-memory</artifactId>
<version>5.5.0</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-highlighter</artifactId>
<version>5.5.0</version>
</dependency>
- LIRE image indexing and search library, slightly modified.
- OpenCV 3.2.0 video and image processing library.
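
The TwelveMonkeys artifacts listed above are plugins for the standard `javax.imageio` API. They register through the ImageIO service-provider mechanism, so application code does not call them directly. A small check like the following (a sketch with a hypothetical class name `ImageFormats`, using only the JDK) lists the formats ImageIO can currently read. With the plugins on the classpath the list should include the additional formats.

```java
import javax.imageio.ImageIO;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: reports which image formats ImageIO can read right now.
public class ImageFormats {
    // Returns the lower-cased, sorted names of all registered reader formats.
    static Set<String> readableFormats() {
        Set<String> names = new TreeSet<>();
        for (String name : ImageIO.getReaderFormatNames()) {
            names.add(name.toLowerCase(Locale.ROOT));
        }
        return names;
    }

    public static void main(String[] args) {
        // The output depends on the JDK and on which plugins are on the classpath.
        System.out.println(readableFormats());
    }
}
```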