Firestore Many-to-Many


Not every many-to-many situation is possible in Firestore, but plenty are. I figured I would try to list all of the ones I can think of, along with their limitations. There are 3 common examples:

- Customers and Products
- Classes and Students
- Followers and Following

They each also have different use cases that can dramatically change the limitations.

**Note:** All of these examples use Angular Firestore in TypeScript, but the data modeling and RxJS usage carry over to other languages.

Let's start simple:

## Classes and Students

The beauty of this example is that there is a limited number of students a class can have, and a limited number of classes a student can take. Even if that number were as high as 1,000, it will never be as high as 10,000, which is theoretically about the most you want to store in one [Firestore Document](https://dev.to/jdgamble555/how-to-build-a-scalable-follower-feed-in-firestore-25oj).

### Model

```typescript
Classes / ClassID: {
  data...
  students: [
    studentID1,
    studentID2,
    ...
  ]
}

Students / StudentID: {
  data...
  classes: [
    classID1,
    classID2
  ]
}
```

You honestly don't need both arrays and can choose either one, but you do need both collections. However, as you will see, it is much easier to query when you use both. I suggest you keep both, and use a [batch](https://firebase.google.com/docs/firestore/manage-data/transactions#batched-writes) to add and update both. That way you can always query in the cleanest way.

**Note:** Angular uses a more complex **ref** for queries, so I simplified the query examples below to use **db** instead of **this.afs**. See [here](https://github.com/angular/angularfire/blob/master/docs/firestore/querying-collections.md) for Angular usage.
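To put the array sizes from this section in perspective, here is a rough back-of-the-envelope estimate. It is only a sketch: it assumes Firestore's documented 1 MiB document limit, the documented string size of UTF-8 bytes plus one, and 20-character auto-generated IDs, and it ignores every other field in the document. The helper names `idArrayBytes` and `fractionOfDocLimit` are made up for illustration.

```typescript
// Assumptions (not from the article): 1 MiB max document size,
// each string costs its UTF-8 byte length + 1, auto IDs are 20 ASCII characters.
const MAX_DOC_BYTES = 1024 * 1024;
const AUTO_ID_LENGTH = 20;

// Approximate bytes used by an array holding `count` document IDs.
function idArrayBytes(count: number, idLength: number = AUTO_ID_LENGTH): number {
  return count * (idLength + 1);
}

// Share of the document size limit consumed by that array alone.
function fractionOfDocLimit(count: number): number {
  return idArrayBytes(count) / MAX_DOC_BYTES;
}

console.log(idArrayBytes(1000));  // 21000
console.log(idArrayBytes(10000)); // 210000
console.log((fractionOfDocLimit(10000) * 100).toFixed(1) + '%'); // 20.0%
```

By this estimate, 10,000 IDs take up roughly a fifth of the document limit, which leaves room for the rest of the data. Longer or custom IDs would change the numbers.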
### Add

#### A student takes a class

```typescript
this.afs.doc('classes/' + classID).update({
  students: firebase.firestore.FieldValue.arrayUnion(studentID)
});
```

OR

```typescript
this.afs.doc('students/' + studentID).update({
  classes: firebase.firestore.FieldValue.arrayUnion(classID)
});
```

### Update

#### A student drops a class

```typescript
this.afs.doc(`classes/${classID}`).update({
  students: firebase.firestore.FieldValue.arrayRemove(studentID)
});
```

OR

```typescript
this.afs.doc(`students/${studentID}`).update({
  classes: firebase.firestore.FieldValue.arrayRemove(classID)
});
```

#### Batch

```typescript
const batch = this.afs.firestore.batch();

const studentID = 'your-student-doc-id';
const classID = 'your-class-doc-id';

// { merge: true } keeps the rest of each document intact;
// a plain set() would overwrite the whole document.
const studentRef = this.afs.doc(`students/${studentID}`).ref;
batch.set(studentRef, {
  classes: firebase.firestore.FieldValue.arrayUnion(classID)
}, { merge: true });

const classRef = this.afs.doc(`classes/${classID}`).ref;
batch.set(classRef, {
  students: firebase.firestore.FieldValue.arrayUnion(studentID)
}, { merge: true });

// Both writes succeed or fail together
await batch.commit();
```

### Query

#### Get all classes a student is taking

```typescript
db.collection('classes')
  .where('students', 'array-contains', studentID);
```

OR

```typescript
import { combineLatest, Observable } from 'rxjs';
import { switchMap } from 'rxjs/operators';

this.afs.doc(`students/${studentID}`).valueChanges().pipe(
  switchMap((r: any) => {
    // one observable per class document listed on the student
    const docs: Observable<any>[] = r.classes.map(
      (id: any) => this.afs.doc(`classes/${id}`).valueChanges()
    );
    return combineLatest(docs);
  })
);
```

You have a list of class IDs in the **classes** array, then you grab the documents one by one. In this case, you're paying one extra read, for the student document itself.
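To make the read counts of the two approaches concrete, here is a small in-memory stand-in. It is not Firestore code: `FakeStore` and its methods are hypothetical, and the billing it mimics is an assumption based on Firestore's documented pricing (one read per document returned, with a minimum of one read per query).

```typescript
// In-memory stand-in for Firestore, used only to count document reads.
type Doc = { id: string; [field: string]: unknown };

class FakeStore {
  reads = 0;
  constructor(private collections: Record<string, Doc[]>) {}

  // Mimics fetching a single document: one read per call.
  getDoc(collection: string, id: string): Doc | undefined {
    this.reads += 1;
    return this.collections[collection].find(d => d.id === id);
  }

  // Mimics where(field, 'array-contains', value): one read per returned document.
  whereArrayContains(collection: string, field: string, value: string): Doc[] {
    const hits = this.collections[collection].filter(d => {
      const arr = d[field];
      return Array.isArray(arr) && arr.indexOf(value) !== -1;
    });
    this.reads += Math.max(hits.length, 1); // assumed minimum of one read per query
    return hits;
  }
}

const store = new FakeStore({
  students: [{ id: 's1', classes: ['c1', 'c2'] }],
  classes: [
    { id: 'c1', students: ['s1'] },
    { id: 'c2', students: ['s1'] },
    { id: 'c3', students: [] }
  ]
});

// Option 1: a single array-contains query on the classes collection.
const viaQuery = store.whereArrayContains('classes', 'students', 's1');
const queryReads = store.reads; // 2

// Option 2: read the student first, then fetch each class by ID.
store.reads = 0;
const student = store.getDoc('students', 's1');
const classIds = student ? (student.classes as string[]) : [];
const viaJoin = classIds.map(id => store.getDoc('classes', id));
const joinReads = store.reads; // 3

console.log(queryReads, joinReads); // 2 3
```

Both options return the same two classes; the join simply pays for the student document on top.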
#### Get all students taking a class

```typescript
db.collection('students')
  .where('classes', 'array-contains', classID);
```

OR

```typescript
this.afs.doc(`classes/${classID}`).valueChanges().pipe(
  switchMap((r: any) => {
    // one observable per student document listed on the class
    const docs: Observable<any>[] = r.students.map(
      (id: any) => this.afs.doc(`students/${id}`).valueChanges()
    );
    return combineLatest(docs);
  })
);
```

Like above, you have a list of student IDs in the **students** array, then you grab the documents one by one. In this case, you're paying one extra read as well, for the class document.

So, while you have 2 different ways to add, update, and query, you don't necessarily need to keep both arrays up to date at all times, but you will have to be creative in your queries. If you want simpler queries in all cases, keep both arrays up to date by using a **batch** when writing to the database.

#### Multiple Where Clauses

You can easily add something like:

```typescript
db.collection('classes')
  .where('students', 'array-contains', studentID)
  .where('status', '==', 'active');
```

if you wanted to get only the active classes a student is taking. Without `orderBy()`, the results are sorted by document ID.

### Sorting

Once you want to sort the results, you need to create an index. This can be one step more complicated. The error logged in the console includes a link, and clicking it builds the index automatically (see the [indexing docs](https://firebase.google.com/docs/firestore/query-data/indexing)).

#### Get all classes a student is taking sorted by name

```typescript
db.collection('classes')
  .where('students', 'array-contains', studentID)
  .orderBy('name');
```

This requires an index. This kind of index is not bad, as you know the name of the **students** array and the **name** field you are sorting on.

**Note**: You will need to create another index for EACH new combination of where clauses you add to this query.

```typescript
db.collection('classes')
  .where('students', 'array-contains', studentID)
  .where('status', '==', 'active')
  .orderBy('name');
```

OR

Here you avoid the index.
The frontend is not as clean, but in this example you just add a `map` operator after the `switchMap`:

**Add a map to sort**

```typescript
// sort by name
map((s: any) => s.sort((a: any, b: any) => {
  const f = 'name';
  if (a[f] < b[f]) { return -1; }
  if (b[f] < a[f]) { return 1; }
  return 0;
}))
```

After the **switchMap**

```typescript
import { combineLatest, Observable } from 'rxjs';
import { map, switchMap } from 'rxjs/operators';

this.afs.doc(`students/${studentID}`).valueChanges().pipe(
  switchMap((r: any) => {
    const docs: Observable<any>[] = r.classes.map(
      (id: any) => this.afs.doc(`classes/${id}`).valueChanges()
    );
    return combineLatest(docs);
  }),
  // sort by name
  map((s: any) => s.sort((a: any, b: any) => {
    const f = 'name';
    if (a[f] < b[f]) { return -1; }
    if (b[f] < a[f]) { return 1; }
    return 0;
  }))
);
```

You see the pattern here, which would be the same for **getting all students taking a class sorted by the student's name**.

#### Where Clause on Frontend Joins

It is tempting to think you should just get all the documents, then filter them afterwards like so:

```typescript
map((a: any[]) => a.filter((f: any) => f.status === 'active'))
```

In context:

```typescript
this.afs.doc('students/' + studentID).valueChanges().pipe(
  switchMap((r: any) => {
    const docs: Observable<any>[] = r.classes.map(
      (id: any) => this.afs.doc('classes/' + id).valueChanges()
    );
    return combineLatest(docs);
  }),
  // every class document has already been read by this point
  map((a: any[]) => a.filter((f: any) => f.status === 'active'))
);
```

While this technically works, it downloads every document and only then throws away the ones you don't need. Instead, you should use a where clause on each doc query, filter out the empty results, then unwrap the single match from each inner array. This way only the documents that match the where clause are returned, though keep in mind that Firestore still bills a minimum of one read for a query that returns nothing.
```typescript
this.afs.doc('students/' + studentID).valueChanges().pipe(
  switchMap((r: any) => {
    // one single-document query per class ID, filtered by status
    const docs: Observable<any>[] = r.classes.map(
      (id: any) => this.afs.collection('classes', ref => ref
        .where(firebase.firestore.FieldPath.documentId(), '==', id)
        .where('status', '==', 'active')
      ).valueChanges()
    );
    return combineLatest(docs);
  }),
  map((arr: any[]) => arr
    // drop queries that matched nothing
    .filter((f: any) => f && f[0])
    // unwrap the single matching document
    .map((m: any[]) => m[0])
  )
);
```

So as you can see, it gets arduous to use all these RxJS joins. Keep both arrays at all times, and you should be able to query from either direction without having to use these joins. However, you need to know how to use them anyway.

I will make this a series. In the next post, I will talk about the benefits of using a **map** type instead of an **array** type in Firestore. I will eventually get to complex cases for scaling issues.

Let me know if I missed something.

J