When should I use bulk query? Read on and let's analyse it together.
This blog builds upon my previous blog, Consuming Shopify Product API Example. We'll compare the bulk query with the non-bulk query we did in that example.
Shopify Bulk Query Changes
These are the changes we made to the previous example so we can run a Shopify bulk query.
main.go
// snipped...
func main() {
	config, err := config.Load("./conf/config.json")
	if err != nil {
		panic(err)
	}
	products := service.BulkQuery(config)
	r := chi.NewRouter()
	r.Use(middleware.Logger)
	r.Get("/", router.GetStatus)
	r.Mount("/products", router.GetProducts(config))
	r.Mount("/cached-products", router.GetCachedProducts(config, products))
	http.ListenAndServe(fmt.Sprintf(":%v", config.Port), r)
}
Just a few changes here. We created a new package named service to handle the bulk query operations and added a new path, /cached-products, that serves the cached products. The bulk query runs once at start up, before the server begins listening. Quick and easy.
model.go
package service

type Node struct {
	ID          string   `json:"id,omitempty"`
	Title       string   `json:"title,omitempty"`
	Handle      string   `json:"handle,omitempty"`
	Vendor      string   `json:"vendor,omitempty"`
	ProductType string   `json:"productType,omitempty"`
	Tags        []string `json:"tags,omitempty"`
	Namespace   string   `json:"namespace,omitempty"`
	Key         string   `json:"key,omitempty"`
	Value       string   `json:"value,omitempty"`
	ParentID    string   `json:"__parentId,omitempty"`
}

type Product struct {
	ID          string      `json:"id,omitempty"`
	Title       string      `json:"title,omitempty"`
	Handle      string      `json:"handle,omitempty"`
	Vendor      string      `json:"vendor,omitempty"`
	ProductType string      `json:"productType,omitempty"`
	Tags        []string    `json:"tags,omitempty"`
	Metafields  []Metafield `json:"metafields,omitempty"`
}

type Metafield struct {
	Namespace string `json:"namespace,omitempty"`
	Key       string `json:"key,omitempty"`
	Value     string `json:"value,omitempty"`
	ParentID  string `json:"__parentId,omitempty"`
}
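Here we model the data types we use. Node is a union of the product and metafield fields, so it can hold any line of the JSONL file. Product and Metafield make up the tree we serve.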
products.go
// snipped...
func GetCachedProducts(config config.Config, products []service.Product) chi.Router {
	router := chi.NewRouter()
	router.Get("/", func(w http.ResponseWriter, r *http.Request) {
		jsonBytes := marshaller.Marshal(products)
		w.Header().Set(contentType, applicationJson)
		w.WriteHeader(200)
		w.Write(jsonBytes)
	})
	return router
}
A function that handles requests to the new /cached-products path.
bulk-query.go
// snipped...
func BulkQuery(config config.Config) []Product {
	fmt.Println("++ bulk query")
	bulkQueryGql := fmt.Sprintf(`
		mutation {
			bulkOperationRunQuery(
				// ... snipped ...
			}
		}
	`)
	query := GqlQuery{
		Query: bulkQueryGql,
	}
	client := &http.Client{}
	responseBody, err := sendRequest(client, query, config)
	if err != nil {
		panic(err)
	}
	gqlResp := marshaller.Unmarshal[GqlResponse](responseBody)
	if gqlResp.Data.BulkOperationRunQuery.BulkOperation.Status == "CREATED" {
		fmt.Println("Created at: ", gqlResp.Data.BulkOperationRunQuery.BulkOperation.CreatedAt)
		currentOperationQueryGql := fmt.Sprintf(`
			query CurrentBulkOperation {
				currentBulkOperation {
					completedAt
					createdAt
					errorCode
					fileSize
					id
					objectCount
					status
					url
				}
			}
		`)
		query = GqlQuery{
			Query: currentOperationQueryGql,
		}
		// Poll every two seconds until the operation reaches a final status.
		for {
			time.Sleep(time.Second * 2)
			responseBody, err := sendRequest(client, query, config)
			if err != nil {
				panic(err)
			}
			gqlResp = marshaller.Unmarshal[GqlResponse](responseBody)
			if gqlResp.Data.CurrentBulkOperation.Status == "CANCELED" ||
				gqlResp.Data.CurrentBulkOperation.Status == "CANCELING" ||
				gqlResp.Data.CurrentBulkOperation.Status == "EXPIRED" ||
				gqlResp.Data.CurrentBulkOperation.Status == "FAILED" {
				fmt.Println("Status: ", gqlResp.Data.CurrentBulkOperation.Status)
				break
			}
			if gqlResp.Data.CurrentBulkOperation.Status == "COMPLETED" {
				fmt.Println("URL: ", gqlResp.Data.CurrentBulkOperation.URL)
				productFile, err := downloadFile("products.tmp", gqlResp.Data.CurrentBulkOperation.URL)
				if err != nil {
					break
				}
				return parseProductsFile(productFile)
			}
		}
	}
	// Nothing to cache: the operation was not created, did not complete, or the download failed.
	return make([]Product, 0)
}
// snipped...
This is where all the magic happens. We issue a bulk query request via a mutation operation. We then unmarshal the response and check whether the bulk operation has been created. If it has, we poll the current bulk operation every two seconds until it is completed, canceled, expired or failed. Once it is completed, we download the result and save it to a temporary file. The file is in JSONL (JSON Lines) format, so we have to parse it in order to build the product tree.
Shopify can also notify you through a webhook when the bulk operation finishes. This is recommended over polling because it cuts out the redundant API calls. But for the purposes of this example, we'll do polling.
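For the curious, here is a rough sketch of what the webhook side could look like. It is not in the repo. It assumes a subscription to the bulk_operations/finish topic, that Shopify signs the raw body with the app secret and sends the base64 HMAC-SHA256 in the X-Shopify-Hmac-Sha256 header, and that the payload has the fields shown in finishPayload. Please check those assumptions against the Shopify webhook documentation. The names finishPayload, sign, validSignature, bulkFinishHandler and onComplete are made up for this illustration.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"io"
	"net/http"
	"strings"
)

// finishPayload holds the fields this sketch reads from the webhook body.
// The field names are assumptions, verify them against the Shopify docs.
type finishPayload struct {
	AdminGraphqlAPIID string `json:"admin_graphql_api_id"`
	Status            string `json:"status"`
	ErrorCode         string `json:"error_code"`
}

// sign returns the base64 encoded HMAC-SHA256 of body keyed with secret.
func sign(body []byte, secret string) string {
	mac := hmac.New(sha256.New, []byte(secret))
	mac.Write(body)
	return base64.StdEncoding.EncodeToString(mac.Sum(nil))
}

// validSignature compares the signature header with our own computation
// using a constant time comparison.
func validSignature(body []byte, header, secret string) bool {
	return hmac.Equal([]byte(sign(body, secret)), []byte(header))
}

// bulkFinishHandler verifies the signature, decodes the payload and calls
// onComplete with the bulk operation ID when the status is "completed".
func bulkFinishHandler(secret string, onComplete func(operationID string)) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		body, err := io.ReadAll(r.Body)
		if err != nil {
			http.Error(w, "cannot read body", http.StatusBadRequest)
			return
		}
		if !validSignature(body, r.Header.Get("X-Shopify-Hmac-Sha256"), secret) {
			http.Error(w, "bad signature", http.StatusUnauthorized)
			return
		}
		var p finishPayload
		if err := json.Unmarshal(body, &p); err != nil {
			http.Error(w, "bad payload", http.StatusBadRequest)
			return
		}
		if strings.EqualFold(p.Status, "completed") {
			onComplete(p.AdminGraphqlAPIID)
		}
		w.WriteHeader(http.StatusOK)
	}
}
```

With chi it could be mounted with something like r.Post("/webhooks/bulk-finish", bulkFinishHandler(secret, refresh)), where the path and refresh are again hypothetical. As far as I understand, the payload carries the operation ID rather than the download URL, so onComplete would still have to query the operation to get the URL before downloading the file.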
For more details about Shopify bulk operations, go to Perform bulk operations with the GraphQL Admin API. Go to Bulk Operation Status to learn more valid status values.
The JSONL file would look something like below:
{"id":"gid:\/\/shopify\/Product\/8787070189860","title":"The Videographer Snowboard","handle":"the-videographer-snowboard","vendor":"Quickstart (5cec88e7)","productType":"","tags":[]}
{"id":"gid:\/\/shopify\/Product\/8787070222628","title":"The Minimal Snowboard","handle":"the-minimal-snowboard","vendor":"Quickstart (5cec88e7)","productType":"","tags":[]}
{"id":"gid:\/\/shopify\/Product\/8787070386468","title":"The Archived Snowboard","handle":"the-archived-snowboard","vendor":"Snowboard Vendor","productType":"","tags":["Archived","Premium","Snow","Snowboard","Sport","Winter"]}
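The sample above only has product lines. When the bulk query also asks for metafields, each metafield comes out on its own line with a __parentId that points back to its product. Here is a minimal sketch of how such a file could be folded into a product tree. It is not the parseProductsFile from the repo. The names line, parseProducts and parseProductsFromString are made up for this illustration, the structs are trimmed down versions of the ones in model.go, and it assumes a child line always comes after its parent line.

```go
package main

import (
	"bufio"
	"encoding/json"
	"io"
	"strings"
)

type Metafield struct {
	Namespace string `json:"namespace,omitempty"`
	Key       string `json:"key,omitempty"`
	Value     string `json:"value,omitempty"`
}

type Product struct {
	ID         string      `json:"id,omitempty"`
	Title      string      `json:"title,omitempty"`
	Tags       []string    `json:"tags,omitempty"`
	Metafields []Metafield `json:"metafields,omitempty"`
}

// line can hold either a product line or a metafield line of the JSONL file.
type line struct {
	ID        string   `json:"id"`
	Title     string   `json:"title"`
	Tags      []string `json:"tags"`
	Namespace string   `json:"namespace"`
	Key       string   `json:"key"`
	Value     string   `json:"value"`
	ParentID  string   `json:"__parentId"`
}

// parseProducts reads JSONL from r. A line without __parentId starts a new
// product; a line with __parentId is attached to that product as a metafield.
// Assumes parents appear before their children.
func parseProducts(r io.Reader) ([]Product, error) {
	var products []Product
	index := map[string]int{} // product ID -> position in products
	scanner := bufio.NewScanner(r)
	scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024) // allow long lines
	for scanner.Scan() {
		if len(scanner.Bytes()) == 0 {
			continue // skip blank lines
		}
		var l line
		if err := json.Unmarshal(scanner.Bytes(), &l); err != nil {
			return nil, err
		}
		if l.ParentID == "" {
			index[l.ID] = len(products)
			products = append(products, Product{ID: l.ID, Title: l.Title, Tags: l.Tags})
			continue
		}
		if i, ok := index[l.ParentID]; ok {
			p := &products[i]
			p.Metafields = append(p.Metafields, Metafield{Namespace: l.Namespace, Key: l.Key, Value: l.Value})
		}
	}
	return products, scanner.Err()
}

// parseProductsFromString is a small convenience wrapper, handy for trying
// the parser out without a file.
func parseProductsFromString(s string) ([]Product, error) {
	return parseProducts(strings.NewReader(s))
}
```

Reading line by line with a scanner means the whole file never has to sit in memory as one big string, which matters once the store has more than a handful of products. The map keeps the lookup of the parent cheap.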
Running the Shopify Bulk Query Example
On start up, you should see something like below. Shopify has returned a download URL, which means the bulk query has completed and we are ready to serve the cached products.
Comparison with Non-Bulk Query
Now let's compare the new way of pulling data to the old way.
Can you spot the difference? In terms of speed, the new way was clearly faster: just 4 ms compared to 279 ms. Imagine that! The cached endpoint answers from memory, so no call to Shopify is made when the request comes in. What's more, the old way not only took longer, it also returned less data: 639 B compared to 4.65 KB. In other words, the old way returned only 3 products, while the new way returned all products including metafields. That's the icing on the cake. As for the start up time of the app, the extra cost was negligible.
Shopify Bulk Query Wrap Up
Would you use a bulk query now or not? It is up to you to identify a potential bulk query. Queries that use pagination to get all pages of results are the most common candidates. There are limitations on a bulk query though. For example, you can't pull connections that are nested more than two levels deep. Check the Shopify documentation for more information.
There you have it. Another way to pull product data from the Shopify GraphQL Admin API. If you have a better way of doing things (e.g. how to parse the JSONL better), just raise a pull request. Happy to look at it. Grab the repo at github.com/jpllosa/shopify-product-api. The code is on the bulk-query branch.