LynxKite is a powerful open-source analytics tool for very large graphs and other datasets.
It scales to billions of edges thanks to the underlying Apache Spark cluster computing engine.
It seamlessly combines the benefits of a friendly graphical interface and a powerful Python API.
- Hundreds of scalable graph operations, including graph metrics like PageRank, embeddedness,
and centrality, machine learning methods including GCNs, graph segmentations like modular
clustering, and various transformation tools like aggregations on neighborhoods.
- The two main data types are graphs and relational tables. Switch back and forth between the two
as needed to describe complex logical flows. Run SQL on both.
- A friendly web UI for building powerful pipelines of operation boxes. Define your own custom boxes
to structure your logic.
- Tight integration with Python lets you implement custom transformations or create whole
workflows through a simple API.
- Integrates with the Hadoop ecosystem. Import and export data in formats and systems including
CSV, JSON, Parquet, ORC, JDBC, and Hive.
- Fully documented.
- Proven in production on large clusters and real datasets.
- Fully configurable graph visualizations and statistical plots. Experimental 3D and ray-traced
graph visualizations.

All of these features are included in our open-source (AGPL) release.
We also offer an enterprise version with the following additions:

- Collaboration features for multiple users on a shared LynxKite instance.
- OAuth and LDAP integration.
- Fine-grained access control.
- Support and professional services.

LynxKite is under active development.
Check out our Roadmap to see what we have planned for future releases.

Algorithms in LynxKite

- Edge graph
- Random graphs (e.g. scale-free)
- Snowball sampling
- Random walks

Centrality & other metrics

- Harmonic centrality
- Lin centrality
- Closeness centrality
- Local clustering coefficient
- Dispersion of connections
- Edge embeddedness
- Information communities
- (Strongly) connected components
- Maximal cliques
- Modular clustering
- Community processing
- Graph diffusion operators
- Graph diffusion through communities
- Triadic closure
- Viral modelling
- Graph convolutional networks (GCN)
- Decision trees
- k-means clustering
- Logistic regression
- Linear regression
- Pearson correlation coefficient
- Steiner tree
- Vertex coloring
- Neighborhood fingerprinting
- Edge prediction via hyperbolic mapping
- Shortest path distance from a set
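
To give a feel for what two of the listed metrics compute, here is a minimal plain-Python sketch of shortest path distance from a set and harmonic centrality on a tiny undirected graph. It does not use LynxKite or Spark and is not how LynxKite implements these operations. The function names, the adjacency-dictionary representation, and the use of unweighted hop counts are all assumptions made for illustration.

```python
from collections import deque


def distances_from_set(adjacency, sources):
    """Hop distance from each reachable vertex to its nearest source.

    Illustrative multi-source breadth-first search on an unweighted graph.
    Vertices that cannot be reached from any source are left out of the result.
    """
    dist = {s: 0 for s in sources}
    queue = deque(dist)
    while queue:
        v = queue.popleft()
        for w in adjacency.get(v, ()):
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def harmonic_centrality(adjacency, vertex):
    """Sum of 1/d(vertex, u) over every other vertex u reachable from vertex.

    Unreachable vertices contribute 0. The graph is assumed to be undirected,
    so distances from and to the vertex are the same.
    """
    dist = distances_from_set(adjacency, [vertex])
    return sum(1.0 / d for u, d in dist.items() if u != vertex)


# Hypothetical example graph: the path a - b - c - d plus an isolated vertex e.
adjacency = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c"],
    "e": [],
}

if __name__ == "__main__":
    print(distances_from_set(adjacency, ["a", "d"]))
    print(harmonic_centrality(adjacency, "b"))
```

On this graph, `b` and `c` are each one hop from the source set `{a, d}`, and the isolated vertex `e` is absent from the distance map. The harmonic centrality of `b` is 1 + 1 + 1/2 = 2.5.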