Page 1 of 5
System Requirements Document (SRD)
1. Introduction
The GenBI System is designed to revolutionize how users interact with and manage datasets. By combining AI-powered analytics, chat-based interaction, and robust version control, the system provides an intuitive and production-ready architecture for data manipulation and insights. The system allows users to upload CSV files, interact with the data via chat, and track every change with a full audit trail, ensuring transparency and control. Additionally, the system will provide human-readable explanations for version history changes, highlighting the impact on the dataset for better clarity and usability.
2. System Overview
The GenBI System enables users to:
- Upload CSV files for storage and analysis.
- Interact with the data through a chat interface powered by AI.
- Gain insights, perform data modifications, and track all changes.
- Revert to any previous version of the dataset with a detailed version history.
- View human-readable summaries of changes in the version history panel, including the impact on the dataset (e.g., rows/columns affected).
- Access both quick tooltips and expandable details for deeper insights into each version's changes.
The system is built on a robust backend architecture using Python, FastAPI, and PostgreSQL, with a Docker-based infrastructure for scalability and reliability. The AI is powered by Google Gemini, acting as both a Data Analyst and Data Engineer.
Page 2 of 5
3. Functional Requirements as Story Points
- As a User, I should be able to upload a CSV file.
- As a User, I should be able to view the uploaded data in a preview table.
- As a User, I should be able to interact with the data via a chat interface.
- As a User, I should be able to ask questions about the data, such as trends, aggregations, and outliers.
- As a User, I should be able to request data modifications, such as removing duplicates, deleting rows, adding columns, or updating values.
- As a User, I should be able to see the results of my queries and modifications.
- As a User, I should be able to view a version history panel with summaries of changes.
- As a User, I should be able to see a more human-readable explanation of changes in the version history panel.
- As a User, I should be able to see how changes impacted the dataset, such as the number of rows or columns affected.
- As a User, I should be able to revert to any previous version of the dataset.
- As a User, I should be able to view the SQL executed for any change.
- As a User, I should be able to see a detailed audit log of all actions.
- As a User, I should be able to view quick tooltips for version history summaries.
- As a User, I should be able to expand version history entries to see detailed human-readable explanations and impact summaries.
4. User Personas
1. Data Analyst
- Goals: Gain insights from data, perform aggregations, and identify trends.
- Needs: A user-friendly interface for querying data and generating insights.
- Pain Points: Manual data analysis and lack of intuitive tools.
2. Data Engineer
- Goals: Modify datasets, clean data, and ensure data integrity.
- Needs: Tools for data cleaning, updating, and version control.
- Pain Points: Lack of robust versioning and audit trails.
3. General User
- Goals: Upload data, interact with it, and track changes.
- Needs: An intuitive chat interface and clear version history.
- Pain Points: Difficulty in understanding technical logs and reverting changes.
Page 3 of 5
4. Power User
- Goals: Quickly perform advanced operations and revert changes efficiently.
- Needs: Command-based interactions like "undo last change" and detailed version history logs.
- Pain Points: Limited flexibility in reverting changes and lack of detailed impact summaries.
5. Visuals Colors and Theme
The GenBI System will use a modern, clean, and professional theme:
- Primary Colors: Deep Blue (#003366) and White (#FFFFFF) for a professional look.
- Secondary Colors: Light Gray (#F5F5F5) for backgrounds and accents, and Green (#28A745) for success indicators.
- Highlight Colors: Orange (#FFA500) for warnings and Red (#FF0000) for errors.
- Typography: Sans-serif fonts like Open Sans or Roboto for readability.
- Icons: Minimalistic, flat icons for tools and actions.
6. Signature Design Concept
Interactive Data Universe Landing Page
The homepage will immerse users in an interactive galaxy of data. The screen will feature a 3D rotating universe where each star represents a dataset or tool. Users can:
- Hover over stars to see tooltips with dataset names or tool descriptions.
- Click on stars to "zoom in" and access specific datasets or tools.
- Navigate between stars using a smooth, physics-based drag-and-drop motion.
The background will feature a dynamic starfield animation, subtly shifting colors based on the time of day (e.g., warm tones at sunrise, cool tones at night). Micro-interactions include:
- Stars pulsing gently when hovered over.
- A comet-like animation when transitioning between sections.
- A glowing "central star" representing the chat interface, inviting users to interact.
The landing page will also include a floating toolbar for quick access to file uploads, version history, and settings. This concept ensures the first impression is both visually stunning and functionally engaging.
Page 4 of 5
7. Non-Functional Requirements
- Performance: The system must handle datasets up to 1GB in size with minimal latency.
- Scalability: The architecture must support multiple concurrent users.
- Security: All SQL queries must be validated and parameterized to prevent SQL injection.
- Reliability: The system must maintain a 99.9% uptime.
- Usability: The interface must be intuitive for both technical and non-technical users.
- Clarity: Version history must include human-readable summaries of changes and their impact on the dataset.
- Accessibility: The system must support tooltips and expandable details for version history entries.
8. Tech Stack
Frontend
- React for Web
- Chat Interface
- File Upload UI
- Version History Panel
- Revert Button
- Data Table Preview
Backend
- Python
- FastAPI
- LangGraph
- Google Gemini API
Database
- PostgreSQL (Docker-based)
- SQLAlchemy for ORM
- Alembic for migrations
Infrastructure
Page 5 of 5
AI Tools
- Google Gemini API for AI interactions
- LangGraph for tool-based architecture
9. Assumptions and Constraints
- Users will primarily interact with the system via the chat interface.
- The system will only support CSV file uploads initially.
- PostgreSQL will be used for all data storage and versioning.
- All AI interactions will be powered by Google Gemini.
- Human-readable explanations for version history changes will be generated dynamically.
- Impact summaries will include details such as the number of rows or columns affected.
10. Glossary
- CSV: Comma-Separated Values, a file format for tabular data.
- LangGraph: A tool-based architecture for AI workflows.
- Google Gemini: An AI model used for data analysis and interaction.
- Version History Panel: A UI component showing the history of changes made to the dataset.
- Audit Log: A record of all actions performed on the dataset.
- Human-Readable Explanation: A simplified summary of changes made to the dataset, such as "Removed duplicates in 'sales_data' (affected 200 rows)."
- Impact on Dataset: A description of how a change affected the dataset, such as the number of rows or columns modified.
- Tooltip: A small pop-up that provides quick information when hovering over an element.
- Expandable Details: A UI feature that allows users to click and view more detailed information about a specific item.
This updated SRD incorporates the requirement for human-readable explanations of version history changes, impact summaries, and their display in the history panel, ensuring clarity and usability for all users.
No comments yet. Be the first!