Vector Space Model
Steps:
1. Get the files
cd $HOME/Downloads
wget http://archive.apache.org/dist/lucene/java/5.5.0/lucene-5.5.0.tgz
tar -xvzf lucene-5.5.0.tgz
2. cd Downloads/big-data-2/vector
3. Run the commands
>> ./runLuceneQuery.sh data
>> ./runLuceneTFIDF.sh data
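To make the idea behind the TF-IDF step concrete, here is a minimal sketch of the vector space model in Python. It uses the textbook weighting tf * log(N / df) and cosine similarity. This is an illustration only, not the formula Lucene applies internally; the function names and the three sample documents are made up for this example.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one {term: weight} dict per document, weight = tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)  # raw term counts within this document
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical sample documents (not taken from the course data set).
docs = ["new york times", "new york post", "los angeles times"]
vecs = tf_idf_vectors(docs)
```

With these sample documents, cosine(vecs[0], vecs[1]) is larger than cosine(vecs[0], vecs[2]), because the first two documents share two terms while the first and third share only one.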
Graph Data Model
Import a CSV file into Gephi
Perform statistical operations and layout algorithms on graph data in Gephi
NOTE: Gephi should be run on your native hardware, not in the Cloudera VM. Instructions for downloading, installing, and running Gephi can be found at https://gephi.org/users/install.
Step 1. Download and import CSV file. In your web browser, go to the following link:
https://raw.githubusercontent.com/words-sdsc/coursera/master/big-data-2/graph/diseaseGraph.csv
Click on File, and choose Save as to download the file.
In Gephi, click on File, and choose Import spreadsheet.
Step 2. Examine graph properties.
In the middle pane, Gephi displays the graph.
The black circles are the nodes, and the lines between them are the edges.
Step 3. Perform statistical operations.
Below the Context pane is the Statistics pane, where you can perform various statistical calculations.
We can calculate the average degree by clicking on Run next to Average Degree.
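For reference, the average degree can also be computed by hand from an edge list. The sketch below assumes an undirected graph, where each edge adds one to the degree of both endpoints, so the average is the sum of all degrees divided by the number of nodes. The function name and the small edge list are hypothetical; they are not read from diseaseGraph.csv, and nodes with no edges would not appear in this count.

```python
from collections import defaultdict


def degree_stats(edges):
    """Return (degree per node, average degree) for an undirected edge list."""
    degree = defaultdict(int)
    for source, target in edges:
        # Each undirected edge contributes to the degree of both endpoints.
        degree[source] += 1
        degree[target] += 1
    n_nodes = len(degree)
    average = sum(degree.values()) / n_nodes if n_nodes else 0.0
    return dict(degree), average


# Hypothetical edge list with 4 nodes and 4 edges.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
degrees, average = degree_stats(edges)
print(degrees)
print(average)
```

For a directed graph the value reported by a tool may be defined differently (for example, edges divided by nodes), so compare results with that in mind.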