Android Firebase how to handle real time server to local database connection

The onChildAdded listener gets called enormous amounts of times for every child on this root.

As you already mentioned, and as the docs state, this is the expected behaviour. It is generally not recommended to attach a ChildEventListener to a node (such as the root node) that contains a huge amount of data. Be careful with this practice, because downloading a large amount of data can lead to errors such as OutOfMemoryError. This happens because you implicitly download the entire node you are listening to, along with all the data beneath it, whether that data is stored as simple properties or as complex objects. It is therefore a waste of resources and bandwidth. In this case, the best approach is to flatten the database as much as possible. If you are new to NoSQL databases, this practice is called denormalization, and it is common when working with Firebase. For a better understanding, I recommend you take a look at:

  • This video, Denormalization is normal with the Firebase Database.
  • Official docs regarding Best practices for data structure in Firebase realtime database.
  • My answer from this post: What is denormalization in Firebase Cloud Firestore?
  • This article, Structuring your Firebase Data correctly for a Complex App.
  • This article, NoSQL data modeling techniques.

Please also note that when you duplicate data, there is one thing you need to keep in mind: in the same way you add data, you need to maintain it. In other words, if you want to update or delete an item, you need to do it in every place where it exists.
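To make the maintenance point concrete, here is a minimal sketch in plain Java of building one multi-path update that touches every copy of an item. The paths ("items", "userItems/uid123") and the class and method names are hypothetical and only for illustration; the resulting map is the kind of argument you would hand to updateChildren() on a DatabaseReference, where a null value removes the data at that path.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FanOutUpdate {

    // Builds one map that writes the same value under every location that
    // holds a copy of the item. Passing null as the value marks each copy
    // for deletion instead.
    static Map<String, Object> buildFanOut(List<String> copyPaths, String itemId, Object value) {
        Map<String, Object> updates = new LinkedHashMap<>();
        for (String path : copyPaths) {
            updates.put(path + "/" + itemId, value);
        }
        return updates;
    }

    public static void main(String[] args) {
        // Hypothetical locations where the duplicated item lives.
        List<String> copyPaths = Arrays.asList("items", "userItems/uid123");

        Map<String, Object> update = buildFanOut(copyPaths, "item1", "New title");
        Map<String, Object> delete = buildFanOut(copyPaths, "item1", null);

        System.out.println(update);
        System.out.println(delete);

        // In an Android app, assuming the Firebase SDK is on the classpath,
        // the map would be applied in a single call:
        // FirebaseDatabase.getInstance().getReference().updateChildren(update);
    }
}
```

Building the whole update as a single map means all copies are changed together, rather than one write per location.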

I also recommend reading the last part of my answer to the following post:

  • What is the correct way to structure this kind of data in firestore?

It is written for Cloud Firestore, but the same rules apply to the Firebase Realtime Database.

But then I have lost my CRUD capabilities because it's listening to the new entries and not all of them.

Everything in Firebase is about listeners. You cannot get realtime updates for objects within a node unless you are listening to them, so you cannot limit the results and still expect updates for objects you are not listening to. If you need updates for all objects within a node, you need to listen to all of them. Because this approach isn't practical for large nodes, you can either use denormalization as explained above, or restrict the results with queries that limit the amount of data you get from the database. Regarding your solutions, the second one is much preferred, but you can also consider another approach: loading the data in smaller chunks ordered by a timestamp property, or by any other property that you need.
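The chunked loading idea can be sketched without the SDK. The following plain-Java example uses a TreeMap as a stand-in for a node ordered by a "timestamp" child; the class name, the property name and the assumption that timestamps are unique are mine, not part of your schema. Against the real database, each loadChunk call would roughly correspond to a query such as ref.orderByChild("timestamp").startAt(cursor).limitToFirst(pageSize).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class TimestampPager {

    // Stand-in for the remote node: timestamp -> item key.
    // Assumes timestamps are unique (an assumption made for this sketch).
    private final TreeMap<Long, String> byTimestamp = new TreeMap<>();

    void put(long timestamp, String key) {
        byTimestamp.put(timestamp, key);
    }

    // Returns at most pageSize entries whose timestamp is >= startAt,
    // in ascending timestamp order.
    List<Map.Entry<Long, String>> loadChunk(long startAt, int pageSize) {
        List<Map.Entry<Long, String>> chunk = new ArrayList<>();
        for (Map.Entry<Long, String> entry : byTimestamp.tailMap(startAt, true).entrySet()) {
            if (chunk.size() == pageSize) {
                break;
            }
            chunk.add(entry);
        }
        return chunk;
    }

    public static void main(String[] args) {
        TimestampPager pager = new TimestampPager();
        for (int i = 1; i <= 5; i++) {
            pager.put(1000L * i, "item" + i);
        }

        long cursor = 0L;
        while (true) {
            List<Map.Entry<Long, String>> chunk = pager.loadChunk(cursor, 2);
            if (chunk.isEmpty()) {
                break;
            }
            System.out.println(chunk);
            // The next chunk starts just after the last timestamp seen.
            cursor = chunk.get(chunk.size() - 1).getKey() + 1;
        }
    }
}
```

Keep in mind that a chunk loaded once does not receive realtime updates unless a listener stays attached to that same query, which is the trade-off described above.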

Edit: According to your comment:

Can you please provide tests for each solution (1.denormalization, 2.my solution) examine use of bandwidth and resources and which one is really preferred?

All data is modeled to support the use-cases that an app requires. Unfortunately, I cannot provide tests, because the result really depends on the use-case of the app and the amount of data it contains. What works for one app may be insufficient for another, so the tests would not be valid for everyone. Whether denormalization or your solution is the better choice depends entirely on how you intend to query the database. In the list above, I have added a new resource, which is an answer of mine regarding the denormalization technique in NoSQL databases. I hope it will also help future visitors.