Text Analytics
The R project for statistical computing: http://www.r-project.org/
- Text mining package – “tm”: http://cran.r-project.org/web/packages/tm/index.html
- OpenNLP interface – “openNLP”: http://cran.r-project.org/web/packages/openNLP/index.html
- CRAN task view: Natural Language Processing: http://cran.r-project.org/web/views/NaturalLanguageProcessing.html
openNLP is a center for open source projects related to natural language processing http://opennlp.sourceforge.net/
- openNLP projects http://opennlp.sourceforge.net/projects.html
Topic models
- http://nlp.stanford.edu/software/tmt/tmt-0.3/ (Java, by Ramage & Rosen in Manning & Jurafsky’s group)
- http://psiexp.ss.uci.edu/research/programs_data/toolbox.htm (Steyvers & Griffiths)
Information Visualization
- D3: a JavaScript library for data-driven document visualization and interaction http://d3js.org/
- ParaView, an open-source turnkey application for analyzing and visualizing scientific data sets http://www.paraview.org/
- Protovis, a JavaScript toolkit for building Web-based visualizations http://vis.stanford.edu/protovis/
- VTK, a widely used toolkit for building scientific visualization and informatics applications http://www.vtk.org/
- Titan, extending VTK to provide analytics functionality http://titan.sandia.gov/
- VisTrails, a scientific workflow and provenance management system http://www.vistrails.org/index.php/Main_Page
- VisMashups, providing an easy way to deploy VisTrails workflows over the Web http://www.cs.utah.edu/~juliana/pub/mashup-vis2009.pdf and http://www.vistrails.org/index.php/ProvenanceAnalytics
- Voreen, an interactive visualization environment http://www.voreen.org/
Project management
- Trac Open Source Project http://trac.edgewall.org/
- Redmine http://www.redmine.org/
High-performance computing
- Apache Hadoop: http://hadoop.apache.org/