At Cloudbox Labs, we think logs are an incredibly interesting dataset. They are the heartbeat of our tech stack. In this post we build a robust data infrastructure that can handle large volumes of logs from all our applications and supports real-time analytics as well as batch processing.
Google Cloud offers fully managed services that let end users build big data pipelines for their analytical needs. In this post, we are going to build a data pipeline that analyzes real-time stock tick data streamed from Google Cloud Pub/Sub, runs it through a pair correlation trading algorithm, and publishes the resulting trading signals back to Pub/Sub for execution.
In recent years, Apache Kafka has become the technology of choice for working with streaming data. In this post we will use Apache Kafka to build a real-time NYC subway tracker that shows you when the next train will arrive at the station.