RepoInsight is a powerful tool designed to analyze the file structure of GitHub repositories and visualize the relationships between files. It helps developers gain insights into the file system and its dependencies, providing a comprehensive overview of how various components interconnect. RepoInsight focuses on explaining the repository's architecture, file roles, and relationships rather than just giving a code-level explanation.
With a user-friendly front-end built using Next.js and a robust back-end powered by Python Flask, RepoInsight offers a seamless interface for both casual users and seasoned developers to explore and understand their repositories.
- Directory Structure Analysis: Provides a detailed breakdown of the repository’s folder hierarchy and the purpose of each file and directory.
- File Role Classification: Automatically classifies files into categories such as source code, configuration files, documentation, test files, and assets.
- Dependency Mapping: Shows how files are connected through imports, requires, includes, and configuration references.
- Asset Usage Detection: Identifies how media files like images, fonts, and other resources are used across the project.
- Build and Configuration File Insights: Provides analysis of
Dockerfile
,Makefile
,package.json
, and other configuration files to show how they impact the repository. - Visual Graph Representation: Displays relationships between files in an interactive graph to provide a holistic view of the repository.
- Multi-Language Support: Currently supports analyzing repositories in Python, JavaScript, and more.
RepoInsight leverages a Next.js front-end to provide an intuitive UI for users to upload and analyze repositories. The back-end, built with Python Flask, handles file parsing, dependency analysis, and relationship mapping.
- Provides an interactive dashboard to explore file structure and relationships.
- Receives user input (such as a GitHub repository URL) and sends it to the backend for processing.
- Displays visual representations of the file system and relationships using libraries like D3.js or Graphviz.
- Fetches and processes repository data.
- Recursively scans and categorizes files based on their types.
- Analyzes import and dependency relationships using language-specific parsers (e.g., Python’s
ast
, JavaScript regex, etc.). - Maps configuration files and their impact on the source code (e.g.,
Dockerfile
,requirements.txt
,package.json
). - Communicates results back to the frontend for display.
Follow these instructions to set up and run RepoInsight on your local machine.
- Node.js and npm/yarn installed for Next.js frontend.
- Python 3.7+ installed for the Flask backend.
- Git for cloning repositories.
- Graphviz (optional but recommended) for generating file relationship graphs.
-
Clone the Repository
git clone https://github.com/sujin-sreekumaran/RepoInsight.git cd repoinsight
-
Install Frontend Dependencies (Next.js)
cd client npm install # or yarn install
-
Install Backend Dependencies (Flask)
cd ../server pip install -r requirements.txt
-
Start the Flask Backend
cd server flask run
This will start the Flask API server on
http://127.0.0.1:5000/
. -
Start the Next.js Frontend
cd client npm run dev # or yarn dev
This will start the frontend on
http://localhost:3000/
. -
Access the Application
Open your browser and navigate tohttp://localhost:3000
to start analyzing repositories.
-
Upload a GitHub Repository:
Enter the GitHub repository URL you want to analyze in the provided input field and hit "Analyze." -
View Directory Structure:
Once analyzed, you’ll see a tree-like structure showing all the directories and files in the repository, along with detailed explanations of each file's role. -
Explore File Relationships:
A visual graph will be generated showing the dependencies between files (e.g., which files import or include other files). -
Inspect Configuration and Build Files:
Configuration files likeDockerfile
,Makefile
,package.json
, andrequirements.txt
will be parsed, showing how they impact the codebase and build process.
RepoInsight classifies files into the following categories:
- Source Code: Core application code, such as Python files (
.py
), JavaScript files (.js
), etc. - Configuration Files: Files that define application settings or dependencies (
.json
,.yml
,Dockerfile
, etc.). - Documentation: Files that describe the project (
README.md
,LICENSE
, etc.). - Assets: Media files like images, fonts, and icons.
- Test Files: Files used for testing the application.
Each of these categories is provided with an explanation, allowing users to understand their purpose within the project.
RepoInsight uses an advanced analysis engine to map file dependencies. The following relationships are identified and visualized:
- Imports/Includes: How one file imports modules or libraries from another.
- Asset Usage: How assets like images are referenced in HTML, CSS, or JavaScript.
- Build Configuration: How build-related files (e.g.,
Dockerfile
,package.json
) tie into the project.
These relationships are displayed in a graph format, making it easy to understand the internal architecture of the repository.
We welcome contributions from the community to improve RepoInsight. To contribute:
- Fork the repository.
- Create a feature branch.
- Commit your changes.
- Open a pull request, describing the changes you’ve made.
Feel free to raise issues or feature requests, and we will review them as soon as possible.
This project is licensed under the MIT License - see the (LICENSE) file for details.
We plan to introduce several exciting features in future releases:
- Support for more languages: Extend file and dependency analysis to more programming languages like Java, Ruby, PHP, and others.
- CI/CD Integration: Allow users to automatically analyze repositories during the build process.
- Security Audits: Perform security checks to identify vulnerabilities in repository dependencies.
- Customizable Reports: Enable users to export file system and dependency reports in formats like PDF or Markdown.
RepoInsight offers developers a clear, structured, and easy way to explore the architecture of repositories. Whether you are working on an open-source project or analyzing a large-scale codebase, RepoInsight makes the process seamless and insightful.
Feel free to customize the README to fit any specific needs or features you wish to highlight! This documentation provides a clear overview of the tool and how to get started.