PDF microservice decoupling and data duplication

I’m working on a microservice that other microservices can request a PDF file from. Here is the flow I am considering:

  • The user calls the PDF generation endpoint (a GraphQL mutation) on the microservice that needs the PDF, such as the Sales microservice. That endpoint dispatches a CreateInvoicePDFCommand or CreateQuotePDFCommand containing the related aggregate id.

  • Separately, the client subscribes to updates via a GraphQL subscription (backed by an Axon Query Bus subscription query) so it is notified when the PDF is ready.

  • Should I use a saga to coordinate the flow in the PDF microservice? It would listen for InvoicePDFRequestedEvent and dispatch a command to generate the PDF.

  • The read model in the PDF microservice is updated, and the QueryUpdateEmitter pushes either a byte[] containing the PDF or an S3 URL to the client subscription.
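
To make the steps above concrete, here is a rough sketch of the flow in plain Java with no Axon dependency. Only the command and event names come from my description. The fields, the class names (PdfReadyUpdate, PdfUpdateEmitter, SalesCommandHandler, PdfRequestHandler), the S3 URL format, and the assumption that the Sales command handler is what publishes InvoicePDFRequestedEvent are all placeholders I made up for illustration. PdfUpdateEmitter is just an in-memory stand-in for the role QueryUpdateEmitter would play.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Message types. Only the command and event names come from the steps above;
// the fields are assumptions.
final class CreateInvoicePDFCommand {
    final String invoiceId;
    CreateInvoicePDFCommand(String invoiceId) { this.invoiceId = invoiceId; }
}

final class InvoicePDFRequestedEvent {
    final String invoiceId;
    InvoicePDFRequestedEvent(String invoiceId) { this.invoiceId = invoiceId; }
}

// Hypothetical update pushed to the subscribed client once the PDF exists.
final class PdfReadyUpdate {
    final String invoiceId;
    final String s3Url;
    PdfReadyUpdate(String invoiceId, String s3Url) {
        this.invoiceId = invoiceId;
        this.s3Url = s3Url;
    }
}

// In-memory stand-in for the subscription side of the flow.
final class PdfUpdateEmitter {
    private final Map<String, List<Consumer<PdfReadyUpdate>>> listeners = new HashMap<>();

    // Step 2: the client subscribes for updates about one invoice.
    void subscribe(String invoiceId, Consumer<PdfReadyUpdate> listener) {
        listeners.computeIfAbsent(invoiceId, id -> new ArrayList<>()).add(listener);
    }

    // Step 4: push the result to every listener registered for that invoice.
    void emit(PdfReadyUpdate update) {
        listeners.getOrDefault(update.invoiceId, Collections.emptyList())
                 .forEach(listener -> listener.accept(update));
    }
}

// Step 1 (Sales side): handling the command only publishes the "requested" event.
final class SalesCommandHandler {
    private final Consumer<InvoicePDFRequestedEvent> eventBus;
    SalesCommandHandler(Consumer<InvoicePDFRequestedEvent> eventBus) { this.eventBus = eventBus; }

    void handle(CreateInvoicePDFCommand command) {
        eventBus.accept(new InvoicePDFRequestedEvent(command.invoiceId));
    }
}

// Step 3 (PDF side): react to the event, "generate" the PDF, notify subscribers.
final class PdfRequestHandler {
    private final PdfUpdateEmitter emitter;
    PdfRequestHandler(PdfUpdateEmitter emitter) { this.emitter = emitter; }

    void on(InvoicePDFRequestedEvent event) {
        // Placeholder for real rendering and upload; the URL format is made up.
        String url = "s3://example-bucket/invoices/" + event.invoiceId + ".pdf";
        emitter.emit(new PdfReadyUpdate(event.invoiceId, url));
    }
}
```

Wired together, the client subscribes on the emitter first, then the command goes to SalesCommandHandler (constructed with the PdfRequestHandler's on method as its event bus), and the listener receives a single PdfReadyUpdate. Note that the sketch never touches invoice data, which is exactly the part I am unsure about.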

Is my flow correct, and how should the PDF microservice obtain the data it needs to generate PDFs? I’ve read that one approach is to build a view in the PDF microservice that duplicates data (invoices, quotes, etc.) from the Sales microservice and other related services. I’m worried that this will cause the PDF microservice’s database to grow enormous over time.
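
For reference, the kind of duplicated view I have in mind looks roughly like the sketch below. Everything in it is hypothetical: InvoiceUpdatedEvent, its fields, and the in-memory map standing in for a database table are my own placeholders, not something the Sales microservice publishes today.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical event from the Sales microservice; the fields are assumptions.
final class InvoiceUpdatedEvent {
    final String invoiceId;
    final String customerName;
    final long totalCents;
    InvoiceUpdatedEvent(String invoiceId, String customerName, long totalCents) {
        this.invoiceId = invoiceId;
        this.customerName = customerName;
        this.totalCents = totalCents;
    }
}

// One row of the duplicated view: only the fields a PDF template would need.
final class InvoicePdfView {
    final String invoiceId;
    final String customerName;
    final long totalCents;
    InvoicePdfView(String invoiceId, String customerName, long totalCents) {
        this.invoiceId = invoiceId;
        this.customerName = customerName;
        this.totalCents = totalCents;
    }
}

// Projection in the PDF microservice; the map stands in for a database table.
final class InvoicePdfProjection {
    private final Map<String, InvoicePdfView> rows = new HashMap<>();

    // Upsert: a later event for the same invoice overwrites the earlier row.
    void on(InvoiceUpdatedEvent event) {
        rows.put(event.invoiceId,
                 new InvoicePdfView(event.invoiceId, event.customerName, event.totalCents));
    }

    // Returns the row for the invoice, or null if none has been projected yet.
    InvoicePdfView find(String invoiceId) {
        return rows.get(invoiceId);
    }

    int size() {
        return rows.size();
    }
}
```

In this sketch the view holds one row per invoice rather than one per event, so it grows with the number of invoices, but it is still a second copy of every invoice, and the same would apply to quotes and anything else that needs a PDF.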

Thank you in advance.