Modernizing Object Storage for Cloud Native Deployments

George Nychis, Vageesh Hoskere & Wolfgang Richter

Why should you care?

Data storage is a universal need. Structured data goes into familiar stores like an RDBMS (PostgreSQL, MySQL, Oracle), but unstructured data can be housed in many ways: object storage systems, key-value caches, document stores (if there is some structure), and even flat files on a file system.

This article details how and why your choice of unstructured storage:

  • affects your scalability by making or breaking your cloud native capability,
  • balloons your software maintenance cost, and
  • limits the possible savings you could get on your cloud storage expenses.

We take you on our journey from a home-grown, flat-file-based object storage layer to an off-the-shelf approach with MinIO, which saved us 30% in storage costs and 90% in maintenance costs.

Object Storage at Scale

Soroco’s product Scout collects millions of data points every day from interactions between teams and business applications, during the natural course of a workday. From the collected interactions, Scout detects patterns in the data using various machine learning algorithms to help our customers find opportunities for operational improvement.

Below, we show the flow of this information via an example using Scout events. Scout events are represented as JSON objects that are buffered in memory and then periodically stored in compressed, encrypted JSON files on disk. Compression minimizes the network bandwidth and storage requirements. Encryption protects sensitive data at rest and in flight. Buffering saves compute resources by batch processing events. These services can be in the cloud, or on premises if the customer prefers it. Scout’s data ingestion services then decrypt and decompress the JSON data after which the individual records are post-processed and stored in an RDBMS. The records can then be fetched by our various machine learning algorithms.
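The buffering and compression steps described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not Scout's actual implementation: the `EventBuffer` class, the `read_batch` helper, the batch size, and the event fields are all hypothetical, and encryption is deliberately left out (it would be applied to the compressed bytes with a cryptography library of your choice).

```python
import gzip
import json


class EventBuffer:
    """Collects events in memory and flushes them as one compressed batch."""

    def __init__(self, max_events=3):
        self.max_events = max_events
        self.events = []

    def add(self, event):
        """Buffer one event; return a compressed batch once the buffer is full."""
        self.events.append(event)
        if len(self.events) >= self.max_events:
            return self.flush()
        return None

    def flush(self):
        """Serialize the buffered events to JSON and gzip the result."""
        payload = json.dumps(self.events).encode("utf-8")
        self.events = []
        # Encryption (omitted in this sketch) would be applied to these bytes.
        return gzip.compress(payload)


def read_batch(blob):
    """Ingestion side: decompress a batch and parse the JSON records."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))


buffer = EventBuffer(max_events=3)
batches = [buffer.add({"app": "editor", "seq": i}) for i in range(3)]
records = read_batch(batches[-1])
print(len(records), "events recovered")
```

Only the third call to `add` produces a batch, which is the point of buffering: one compression pass and one write cover many events.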

A sketch of information flow in Scout

We must store the original data though, because post-processing might transform or accidentally drop data that we find useful in the future. For example, an updated machine learning algorithm might want a re-interpretation of the features from the original samples. If we threw them away after post-processing, we could never go back to the original data to improve results. Of course, we also have to store screenshots somewhere, and our RDBMS did not seem like a good choice. A contributor to PostgreSQL benchmarked the performance of object storage in PostgreSQL as compared to disk and found a 10x slowdown in a read-based benchmark. You don’t want to store objects in PostgreSQL!

A typical large-scale deployment spanning 100s of teams and 1000s of users ingests approximately 2B objects per year, equating to over 130 TB (assuming 261 working days). The post-processed structured information stored in our RDBMS is orders of magnitude smaller because it is just the output of a feature engineering pipeline for machine learning algorithms.
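To make these figures more tangible, the short calculation below breaks them down per working day. It assumes that both the 2B objects and the 130 TB are annual totals and that terabytes are decimal (10^12 bytes); the variable names are ours.

```python
OBJECTS_PER_YEAR = 2_000_000_000
BYTES_PER_YEAR = 130 * 10**12  # 130 TB, assuming decimal terabytes
WORKING_DAYS = 261

objects_per_day = OBJECTS_PER_YEAR / WORKING_DAYS
gb_per_day = BYTES_PER_YEAR / WORKING_DAYS / 10**9
avg_object_kb = BYTES_PER_YEAR / OBJECTS_PER_YEAR / 10**3

print(f"{objects_per_day:,.0f} objects per working day")
print(f"{gb_per_day:,.0f} GB per working day")
print(f"{avg_object_kb:,.0f} KB average object size")
```

Under these assumptions the deployment ingests roughly 7.7 million objects and about 498 GB per working day, with an average object size of about 65 KB.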

In addition to the storage needs, the full set of requirements we came up with when looking for an object storage solution was:

  • Handle our storage requirements of objects at scale
  • Decouple storage from our local file system for reducing cost and maintenance
  • Provide compression to reduce storage requirements
  • Minimal maintenance requirements from our engineering team
  • Support for detailed access control lists to protect the original data files
  • Simple integration with cloud native storage services such as Amazon S3 and Azure Blob Storage
  • Local storage if cloud native storage services are not available (e.g., for on-premises)

A solution meeting all these requirements ought to be both cloud-native and scalable. This would let our product handle substantial retention periods (1 year or more), on-demand random access read workloads, and all of the deployment scenarios we care about (bare metal, private cloud, public cloud). In the remainder of this blog article, we present the different approaches and trade-offs that led to our final solution, which saved us 30% in storage costs and 90% in maintenance costs.

Considering our Options for Object Storage

There are a few common options for object storage that we considered while evaluating different designs to meet our requirements.

Filesystem-based Object Storage with References

A low-complexity solution to object storage is to store objects on the disk and keep references to the available objects with any important metadata in a database or index. Git is well known for doing this and implementing a style of it called content-addressable storage (CAS). An example of this is illustrated below.

As illustrated with a CAS system, objects are stored on the filesystem by their hash and any meta-data associated with them can be stored in a database or catalog.
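A content-addressable store of this kind can be sketched as follows. This is a simplified illustration, not Git's format or our former storage layer: the `ContentStore` class is hypothetical, SHA-256 is an assumed choice of hash, and an in-memory dictionary stands in for the metadata database.

```python
import hashlib
import tempfile
from pathlib import Path


class ContentStore:
    """Stores each object on disk under its hash; metadata lives in a catalog."""

    def __init__(self, root):
        self.root = Path(root)
        self.catalog = {}  # name -> metadata; stands in for a database table

    def _path_for(self, digest):
        # Fan out by the first two hex characters to keep directories small.
        return self.root / digest[:2] / digest[2:]

    def put(self, name, data):
        digest = hashlib.sha256(data).hexdigest()
        path = self._path_for(digest)
        if not path.exists():  # identical content is written only once
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_bytes(data)
        self.catalog[name] = {"digest": digest, "size": len(data)}
        return digest

    def get(self, name):
        return self._path_for(self.catalog[name]["digest"]).read_bytes()


with tempfile.TemporaryDirectory() as tmp:
    store = ContentStore(tmp)
    first = store.put("monday/events.json", b'{"event": "click"}')
    second = store.put("tuesday/events.json", b'{"event": "click"}')
    file_count = len([p for p in Path(tmp).rglob("*") if p.is_file()])
    round_trip = store.get("tuesday/events.json")
```

Two names with identical content map to the same digest, so only one file is written to disk. This is the de-duplication property of CAS.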

Benefits of file-based object storage are simplicity in design, and if CAS is used you will get de-duplication of objects for more efficient storage since multiple references can map to the same object on disk. No specialized systems are required to track the objects, and access to them will be as easy as filesystem reads.

Downsides of the filesystem-based object storage are maintenance, lack of access control without building or using a more substantial system around it, and the lack of access to a shared filesystem in modern cloud native deployments where services do not assume local storage. Though you could mount a network share, the performance impact of using an NFS share would likely be substantial. For these reasons, we believe that while this approach is fast and simple, it does not meet many of our requirements.

Distributed Object Storage

To keep the benefits of filesystem-based object storage and overcome the limitations around access to the storage, distributed object storage systems such as Ceph and Swift were built. Their design is illustrated below, where a “storage cluster” is built by distributing objects across any number of block devices (e.g., bare metal disks). This storage is then made accessible through microservices with network accessible APIs to store and retrieve blocks, and fine-grained access control.

An example distributed object storage deployment with Ceph

Benefits of distributed object storage systems such as Ceph and Swift are their ability to scale in storage and bandwidth by adding more disks to the cluster. They have significant controls exposed around the distribution of objects across block devices to achieve redundancy if desired (e.g., through erasure coding), cloning, snapshotting, and thin provisioning for efficiency.

The downside of distributed object storage systems is their complexity. Their design is focused on multiple block devices, with multiple binaries and services to run and maintain the storage system. This would add a lot of complexity for our product Scout, which often operates on-premises with a single storage block device. It would also require the users of our product to have significantly more knowledge to operate the storage system and troubleshoot any issues with it.

Cloud-based Object Storage

Cloud-based solutions such as Amazon S3 and Azure Blob Storage provide highly reliable object storage services with minimal setup, high availability, and zero infrastructure maintenance. How the objects are stored, maintained, and distributed is completely hidden behind simple APIs to store and retrieve your objects. Using Azure Blob Storage as an example, you will use an account with Microsoft’s Azure platform to create containers for your objects. The containers are like virtual “folders” that have access controls around them and allow you to group your objects together.

Benefits of the cloud-based object stores are the simplicity they provide in managing everything for you, where your systems will only need to access them via APIs to store and retrieve objects. You do not need to worry about maintaining any services to keep the object stores running, and you get the benefits of the cloud to keep accessing more storage capacity as you need it, fine-grained access controls per container, and the ability to deploy in multiple regions to minimize latency. Cost of the object stores is often more favorable than increasing the size of your primary partitions to store information (e.g., in a filesystem-based object storage approach), since primary partitions are often faster storage options (higher IOPS) at higher costs.

There are downsides to the cloud-based object stores. Your product or solution must have network connectivity to the cloud. Our product Scout, as an example, is deployed in many on-premises scenarios behind firewalls and within private networks with no connectivity guaranteed. Therefore, we could not base our entire solution on cloud-based storage without guarantees that we would have connectivity to the services.

Bringing the Options Together with Hybrid-Cloud Object Storage and MinIO

We have presented three system designs for object storage: filesystem-based, distributed, and cloud-based. Each approach comes with varying degrees of maintenance, scalability, detailed access control, and ease of on-premises operation. We have summarized this below and will introduce hybrid-cloud object storage in comparison to the previously discussed designs.

Hybrid-cloud object storage provides a unique combination of capabilities: cloud native object storage (e.g., in S3) when desired, with the flexibility of operating on-premises. Though there may be other options available that we are not aware of (please let us know!), we have found MinIO to be the leading option for hybrid-cloud object storage. As we will show through MinIO, switching between cloud-based object storage and an on-premises operational model only takes a configuration change. This means that you can use a hybrid-cloud object storage technology like MinIO to write your object storage code once while staying flexible across many operational scenarios.

MinIO uses a single binary for operations and a single service for each server in distributed mode. A single binary, instead of multiple services that each require setup and maintenance, is a significant advantage of the MinIO approach over Ceph and Swift, which are more complex to set up and maintain. At Soroco, this was the predominant reason we chose MinIO and deprecated some of our own internal storage services, reducing our object storage maintenance costs by 90%.

Another major benefit of the hybrid-cloud object storage approach for Soroco is that MinIO runs cloud natively via a Kubernetes Operator, which is supported directly by the core MinIO team. This means that it has built-in support to self-manage, self-scale (e.g., obtain more storage as needed), and self-heal any services that fail. What cloud native support means with MinIO is:

  • High availability of its runtime (self-healing of its service)
  • Zero downtime upgrades
  • Backups and restores
  • Resource scaling (disk, compute, network, memory…)
  • Permissions and security controls

Below we will show this flexibility and scalability of MinIO. First, we will show a simple example of it running on a single node with local storage. Then, we will reconfigure it to operate through an S3 bucket. To help our readers, we will link to an example of running it in distributed mode. Finally, we will provide guidance on how to migrate to MinIO based on our experience.

Getting Started with Hybrid-Cloud Object Storage

Getting started with hybrid-cloud object storage is simple. We will use MinIO in our examples. As mentioned, MinIO deploys a single binary which reduces complexity of operation and maintenance. Below, we download the single MinIO binary and start the server to run locally with the data stored at /mnt/data.

# Fetch the single `minio` binary for your platform from the MinIO download page, then make it executable.
chmod +x minio

# Start the MinIO server with local storage for data, replacing the path `/mnt/data` as needed.
./minio server /mnt/data

Another option to get the single server up and running configured with local storage is to pull and start the MinIO maintained Docker container:

docker run \
  -p 9000:9000 \
  -p 9001:9001 \
  --name minio_local \
  -v /mnt/data:/data \
  minio/minio server /data --console-address ":9001"

A quick way to check the accessibility of the object storage is to navigate to the MinIO console that we started (port 9001 in the Docker example above). The login credentials are the root user credentials you configured. If the server runs with no root user configuration, the default credentials are minioadmin:minioadmin. After logging in, you reach the main console, where you can monitor the health of the object storage and optionally use the UI to set up users with access control.

Next, let’s use the Python MinIO client library to connect to the storage, store an object that we will create, and retrieve it back to test the end-to-end storage process.  First, install the Python MinIO client with pip:

pip install minio

Create a Python file for the following example which you will be able to run with your MinIO server running locally:

import urllib3
from minio import Minio

# Client connection options.
timeout = urllib3.Timeout.DEFAULT_TIMEOUT
secure = False
hostname = "localhost:9000"  # assumption: local server on the API port (9000) used above
access_key = "minioadmin"
secret_key = "minioadmin"
MAX_POOL_SIZE = 10  # assumption: illustrative value for the connection pool size

# Create an HTTP client pool that can be used by the service.
http_client = urllib3.PoolManager(timeout=timeout, maxsize=MAX_POOL_SIZE)
if secure:
    # When connecting over TLS, also retry on transient server errors.
    http_client = urllib3.PoolManager(
        timeout=timeout,
        maxsize=MAX_POOL_SIZE,
        retries=urllib3.Retry(
            total=5, backoff_factor=0.2, status_forcelist=[500, 502, 503, 504]
        ),
    )

# Create a client with the MinIO server, its access key and secret key.
# The host detail is obtained through configuration. The access and secret keys
# are stored and retrieved from any secret manager. In Soroco, we use Vault.
client = Minio(
    hostname,
    access_key=access_key,
    secret_key=secret_key,
    secure=secure,
    http_client=http_client,
)

# Create a bucket for testing object storage if it does not exist.
bucket = "teststorage"
if not client.bucket_exists(bucket):
    client.make_bucket(bucket)

# Create a local file that we will store and then retrieve.
with open("/tmp/hello-object.txt", "w") as f:
    f.write("Hello MinIO")

# Store the file in object storage.
client.fput_object(bucket, "hello-object.txt", "/tmp/hello-object.txt")

# Retrieve the stored object using the object name and bucket.
client.fget_object(bucket, "hello-object.txt", "/tmp/hello-object-retrieved.txt")

Through the above example, you should have been able to get a local object store running with MinIO with its single binary. To access that store anywhere on the network, the network port (9000 in our example) just needs to be accessible.

Setting the Hybrid-Cloud Object Storage to Use a Cloud-based S3 Container

Configuring the hybrid-cloud solution with MinIO to use a cloud-based container is simple. Since the MinIO client is S3 compatible, all that is needed is to set the host and the access credentials for S3 as shown below.

client = Minio(
    "s3.amazonaws.com",                  # S3 hostname (assumed AWS endpoint; substitute your own)
    access_key="<YOUR_S3_ACCESS_KEY>",   # S3 access key
    secret_key="<YOUR_S3_SECRET_KEY>",   # S3 secret key
)

Once configured, the usual set of client commands can be used as demonstrated in the example from the previous section to create buckets, put files, and get files from the object store.
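One way to keep this switch down to a configuration change is to hold the connection settings for each operational model in a single place and select one at startup. The sketch below is an assumption about how such a configuration could look, not our production code: the profile names, the `client_settings` helper, and the endpoint values are illustrative.

```python
# Connection settings per operational model. All values are placeholders.
PROFILES = {
    "on_premises": {
        "endpoint": "localhost:9000",
        "access_key": "minioadmin",
        "secret_key": "minioadmin",
        "secure": False,
    },
    "cloud": {
        "endpoint": "s3.amazonaws.com",
        "access_key": "<YOUR_S3_ACCESS_KEY>",
        "secret_key": "<YOUR_S3_SECRET_KEY>",
        "secure": True,
    },
}


def client_settings(profile_name, profiles=PROFILES):
    """Return the keyword arguments for Minio(...) for the named profile."""
    try:
        return dict(profiles[profile_name])
    except KeyError:
        raise ValueError(f"unknown storage profile: {profile_name!r}") from None


settings = client_settings("on_premises")

# With the minio package installed and a server reachable, the client is then:
#   from minio import Minio
#   client = Minio(**settings)
```

The rest of the application only ever sees `client`, so moving a deployment from local storage to S3 means changing the selected profile rather than the code that stores and retrieves objects.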

Migrating to MinIO and then to Cloud-based Storage

To migrate your data to MinIO independent of endpoint, you will want to iterate over your objects wherever they may be and put them all into the bucket you have created. While you can do this with the Python client or those for other languages, there is also the MinIO command-line client named mc. This client provides operations on the object storage modeled on UNIX commands like ls, cat, and cp.

This means you can use mc to copy objects to MinIO and your buckets with the cp command as follows, taken from the mc documentation:

mc cp myobject.txt play/mybucket
myobject.txt:    14 B / 14 B  ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓  100.00 % 41 B/s 0

The mc command also lets you mirror a local directory to a cloud-based object store like S3 as follows, where the endpoint URL and bucket name are placeholders to fill in:

mc alias set s3 <S3-ENDPOINT-URL> <S3-ACCESS-KEY> <S3-SECRET-KEY>
mc mirror /mnt/data s3/<BUCKET>

At Soroco, we have also found this to be useful for migrating local object stores to the cloud. By setting the mirror and allowing the data to copy over, the S3 bucket will then be usable as a remote endpoint and the local storage could be retired. This is particularly useful if an on-premises deployment migrates to the cloud, illustrating the power of the hybrid-cloud solution for object storage.
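For migrations driven from code rather than from mc, the "iterate over your objects and put them into the bucket" step can be sketched as below. The `plan_uploads` and `migrate` helpers are hypothetical names of ours. The upload function is passed in so that the sketch runs without a server; with a real client you would pass `client.fput_object`, which takes the bucket name, object name, and file path in that order.

```python
import tempfile
from pathlib import Path


def plan_uploads(root):
    """Yield (object_name, file_path) for every file below root."""
    root = Path(root)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Use the path relative to root, with forward slashes, as the object name.
            yield path.relative_to(root).as_posix(), str(path)


def migrate(root, bucket, upload):
    """Call upload(bucket, object_name, file_path) per file; return the count."""
    count = 0
    for object_name, file_path in plan_uploads(root):
        upload(bucket, object_name, file_path)
        count += 1
    return count


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "2021" / "01").mkdir(parents=True)
    (Path(tmp) / "2021" / "01" / "events.json").write_text("{}")
    (Path(tmp) / "readme.txt").write_text("hello")

    uploaded = []
    # A recorder stands in for client.fput_object in this sketch.
    total = migrate(tmp, "teststorage", lambda b, name, path: uploaded.append((b, name)))
```

Keeping the directory layout in the object names means the migrated bucket mirrors the old flat-file hierarchy, which makes it easier to verify the copy before retiring the local storage.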

Concluding on Modernizing our Object Storage with Hybrid-Cloud Solutions

In this blog post, we presented multiple ways to achieve object storage with their trade-offs, and showed how hybrid-cloud object storage can operate on-premises when suitable and cloud natively, with self-scaling, self-healing, and self-managing properties, when desired.

At Soroco, we have found this hybrid-cloud object storage approach, particularly through MinIO, to give us tremendous flexibility in where we deploy our technology. It gives us scale and simplicity to minimize our operational overhead of maintaining our object store. With simple to use tooling and libraries provided by MinIO, we can easily integrate the access to the store in our code, flexibly change the backend, and even migrate data across different operational models (e.g., can migrate to on-premises or to the cloud as needed).

We would love to hear more about your experiences with object storage, any of the technologies that we have mentioned, and especially others that we may have missed and should consider in our (or others’) journey.
