We developed a high-load data storage system to the client’s requirements.
The scale of the system is comparable to that of Spotify: it processes billions of files per day and receives approximately 600,000 records per second.
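To put the ingest rate in daily terms, the arithmetic can be sketched as follows. This is a back-of-the-envelope illustration only: it assumes the rate of 600,000 records per second is sustained around the clock, which the description does not state, and the function name is hypothetical.

```python
# Rough daily volume implied by a sustained ingest rate.
# Assumption: the per-second rate holds for a full 24 hours.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def records_per_day(records_per_second: int) -> int:
    """Scale a per-second record rate to a per-day total."""
    return records_per_second * SECONDS_PER_DAY


daily = records_per_day(600_000)
print(f"{daily:,} records per day")  # prints "51,840,000,000 records per day"
```

Under that assumption, the system would receive roughly 52 billion records per day.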
Given these data volumes, the distributed Cassandra database was chosen for storage. Data centers in several regions were used to ensure data integrity and to balance load.
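In Cassandra, replication across data centers is typically configured per keyspace with NetworkTopologyStrategy. The sketch below builds such a CQL statement as a string. The keyspace name, data center names, and replica counts are hypothetical placeholders, not the project's actual configuration, and the statement is only printed, not executed against a cluster.

```python
from typing import Dict


def keyspace_cql(name: str, replicas_per_dc: Dict[str, int]) -> str:
    """Build a CREATE KEYSPACE statement that replicates data per data center.

    `name` and the data center names are illustrative; real data center names
    must match those reported by the cluster's snitch.
    """
    if not replicas_per_dc:
        raise ValueError("at least one data center is required")
    options = ", ".join(f"'{dc}': {rf}" for dc, rf in replicas_per_dc.items())
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {name} "
        f"WITH replication = {{'class': 'NetworkTopologyStrategy', {options}}};"
    )


# Hypothetical example: three replicas in each of two regions.
statement = keyspace_cql("storage", {"eu_west": 3, "us_east": 3})
print(statement)
```

With a per-data-center replica count, each region holds its own full set of replicas, so reads can be served locally and data survives the loss of a single data center.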
As for billing, the system does not issue invoices itself, but it processes the data on which invoicing is based. To process the distributed data and run calculations on it, the development team used Apache Spark.
All data is processed and stored by microservices. The system currently comprises about 20 of them.
To secure the system, the SFTP protocol, single sign-on (SSO), and data encryption are used.
In the initial stage, the project followed the Scrum methodology with two-week sprints. Once a month, the development team presented a demo version directly to the client.
During active implementation, the development team switched to the Kanban methodology to optimize its work processes.