The Wayback Machine - https://web.archive.org/web/20220415092721/https://github.com/gatsbyjs/gatsby/pull/33161
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(gatsby-core-utils): add fetch mutex for PQR #33161

Merged
merged 4 commits into from Sep 15, 2021
Merged

Conversation

Copy link
Member

@wardpeet wardpeet commented Sep 13, 2021

Description

When multiple workers fetch the same file we end up with multiple requests to that file. We write to a seperate tmp file but the end result writes (fs.move) to the same file.
This can lead to corrupt files if other parts of the application are resolving it. This PR creates some hacky mutex implementation to make it only write once. It's still possible to write multiple tmp files.

  1. Add global build id so mutex can be cleared per build.
  2. When fetch is in progress save metadata into cache with worker id & build id
  3. When fetch is complete and we're going to write - check cache again and only continue with worker from metadata. Other workers poll end result.

As an adittion I took the inFlight cache from #30391 and implemented it as well to do less polling.

@gatsbot gatsbot bot added the status: triage needed label Sep 13, 2021
@wardpeet wardpeet added gatsby 4 topic: data and removed status: triage needed labels Sep 14, 2021
@wardpeet wardpeet marked this pull request as ready for review Sep 14, 2021
if (entry.status === `complete`) {
cb(entry.result)
} else {
setTimeout(() => {
pollUntilComplete(cache, url, buildId, cb)
// Magic number
}, 500)
}
Copy link
Contributor

@vladar vladar Sep 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if a worker starts polling but then the "main" worker fails (i.e. requestRemoteNode throws)? I don't see how it can exist from this function in this case. Can we have a test case for it as it is a typical scenario?

Copy link
Member Author

@wardpeet wardpeet Sep 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We keep on hanging 😂 good point but wouldn't the process fail anyway?

Copy link
Contributor

@vladar vladar Sep 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depends on how folks use it. If it happens in custom resolver - the corresponding field can just return null (graphql allows partial errors for nullable fields)

vladar
vladar previously approved these changes Sep 14, 2021
Copy link
Contributor

@vladar vladar left a comment

Thank you 👍

@wardpeet wardpeet merged commit aa050d7 into master Sep 15, 2021
26 of 31 checks passed
@wardpeet wardpeet deleted the feat/add-fetch-mutex branch Sep 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic: data
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants