The Wayback Machine - https://web.archive.org/web/20211201235616/https://github.com/gatsbyjs/gatsby/discussions/27889

Clarification on trailing slashes, urls in sitemaps and canonical urls? #27889

Answered by jon-sully
alexpchin asked this question in Help
Nov 7, 2020 · 7 answers · 15 replies

I am currently working on a gatsby website and I've been struggling to find the "best practise" and implementation for all things related to trailing slashes:

  • "to use or not use" trailing slashes
  • trailing slashes in URLs in sitemaps
  • trailing slashes in canonical URLs
  • confusion over gatsby-plugin-remove-trailing-slashes and gatsby-plugin-force-trailing-slashes

After researching, it seems other people are confused as well, and I don't feel the documentation is clear about what the best practices are or how to achieve them. There seems to be an unnecessary amount of learning required beyond what works out of the box to arrive at the best setup, and it is unclear which parts require server configuration and which do not.

Here are my questions:

1. If gatsby pages are generated with a directory structure about/index.html, so should the URLs have a trailing slash?

According to:

If the resource you are linking to is a directory then it should have a trailing slash. As Gatsby by default creates directories for URLs, should they include the trailing slash?

2. Should users set up 301 redirects from no trailing slash to trailing slash?

According to the previous links, the non-trailing-slash URL should have a 301 redirect to the trailing-slash URL.

When setting up, it seemed like the plugin gatsby-plugin-force-trailing-slashes would do this for you; however, it doesn't do the 301 redirect, which is a server setup problem.

In this project I am using CloudFront, so I used this function to redirect from no trailing slash to trailing slash. This is not a trivial thing for someone starting out to work out.

// https://gist.github.com/karolmajta/6aad229b415be43f5e0ec519b144c26e

'use strict';

const pointsToFile = uri => /\/[^/]+\.[^/]+$/.test(uri);

exports.handler = (event, context, callback) => {
    
    // Extract the request from the CloudFront event that is sent to Lambda@Edge
    const request = event.Records[0].cf.request;

    // Extract the URI from the request
    const oldUri = request.uri;

    // Directory-style URI without a trailing slash: 301 to the slashed URI,
    // preserving any query string
    if (!pointsToFile(oldUri) && !oldUri.endsWith('/')) {
      const redirectUri = request.querystring ? `${oldUri}/?${request.querystring}` : `${oldUri}/`;
      return callback(null, {
        body: '',
        status: '301',
        statusDescription: 'Moved Permanently',
        headers: {
          location: [{
            key: 'Location',
            value: redirectUri,
          }],
        }
      });
    }

    // Match any '/' that occurs at the end of a URI. Replace it with a default index
    const newUri = oldUri.replace(/\/$/, '/index.html');
    
    // Log the URI as received by CloudFront and the new URI to be used to fetch from origin
    console.log("Old URI: " + oldUri);
    console.log("New URI: " + newUri);
    
    // Replace the received URI with the URI that includes the index page
    request.uri = newUri;
    
    // Return to CloudFront
    return callback(null, request);

};
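
For anyone adapting the function above, it can help to see which URIs the `pointsToFile` check treats as files before deploying. The snippet below is a small local check, not part of the original gist; `classify` and the sample paths are made up for illustration.

```javascript
// Same regex as in the handler above: the last path segment must contain a dot.
const pointsToFile = uri => /\/[^/]+\.[^/]+$/.test(uri);

// Hypothetical helper that mirrors the three outcomes of the handler.
const classify = uri => {
  if (pointsToFile(uri)) return 'file';        // passed through untouched
  if (uri.endsWith('/')) return 'directory';   // rewritten to .../index.html
  return 'redirect';                           // 301 to the slashed URI
};

// Hypothetical sample paths, chosen only to illustrate the classification.
const samples = ['/about', '/about/', '/main.js', '/blog/my.post', '/'];

for (const uri of samples) {
  console.log(uri, '->', classify(uri));
}
```

Note that a page slug containing a dot, such as `/blog/my.post`, is classified as a file and is never redirected, which may or may not be what you want.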

I know on Netlify there are other options to do this.

Note: Alternatively gatsby-plugin-create-page-html is a plugin that creates an about.html file from an about/index.html file. However, this just means you have two resources for the same content, which I believe isn't good.

3. Your pages should include a canonical URL that includes the trailing slash

I tried both gatsby-plugin-canonical-urls and gatsby-plugin-react-helmet-canonical-urls to add canonical URLs to my pages. In the documentation for gatsby-plugin-canonical-urls it says:

With the above configuration, the plugin will add to the head of every HTML page a rel=canonical e.g.

<link rel="canonical" href="https://www.example.com/about-us/" />

This shows that the URL should have a trailing slash.

FYI: I added a patch for gatsby-plugin-react-helmet-canonical-urls that adds an option to force a trailing slash:

diff --git a/node_modules/gatsby-plugin-react-helmet-canonical-urls/wrap-page.js b/node_modules/gatsby-plugin-react-helmet-canonical-urls/wrap-page.js
index ea5cc53..37ee5a0 100644
--- a/node_modules/gatsby-plugin-react-helmet-canonical-urls/wrap-page.js
+++ b/node_modules/gatsby-plugin-react-helmet-canonical-urls/wrap-page.js
@@ -7,6 +7,7 @@ var _require = require('react-helmet'),
 
 var defaultPluginOptions = {
   noTrailingSlash: false,
+  trailingSlash: false,
   nopQueryString: false,
   nopHash: false
 };
@@ -33,6 +34,7 @@ module.exports = function (_ref, pluginOptions) {
   if (options.siteUrl && !isExcluded(options.exclude, location.pathname)) {
     var pathname = location.pathname || '/';
     if (options.noTrailingSlash && pathname.endsWith('/')) pathname = pathname.substring(0, pathname.length - 1);
+    if (options.trailingSlash && !pathname.endsWith('/')) pathname = pathname.replace(/\/?$/, '/');
     var myUrl = "" + options.siteUrl + pathname;
     if (!options.noQueryString) myUrl += location.search;
     if (!options.noHash) myUrl += location.hash;

I know these are more of a collection of thoughts but I feel like there are lots of people asking questions related to these issues so the documentation could perhaps be more direct at addressing the solutions?

Replies


LekoArts
Nov 10, 2020
Maintainer

Hi, thanks for the issue!

What documentation did you find confusing? Can you directly link to it and highlight the portions of it?

Gatsby by default creates paths with a trailing slash, everything after that is your personal preference and I don't think we make any recommendations or comments about this. Moreover the documentation should be specific to Gatsby so I don't see the need for explanations about server redirects or SEO implications as there are enough articles on the internet already.

The example you cited from the canonical URL plugin just happens to have the trailing slash as that's what Gatsby is creating by default. It doesn't imply that canonical URLs have to have a trailing slash.

I'd recommend sticking to one of the two options (non-trailing/trailing) and keeping it consistent everywhere.

1 reply
@AleksandrHovhannisyan

Hmm, I'm working on migrating my site to Gatsby, and locally, I see both trailing and non-trailing variants for all pages. How do I enforce one or the other? I'm on Gatsby v2.26.1.

@alexpchin - this is an awesome summarization of a lot of confusing points in the workflow. Kudos for putting all that together 😁 This is something I've been doing a lot of digging on in the last couple of days and I agree: there are no clear prescriptions (which is fine, "let the people choose"), but there's no real guidance in any given direction either.


EDIT April 2021: This post ended up helping a lot of folks so do feel free to give it a read, but if you want to know more and explore the why/how behind the workflow, I explored this further on my blog: https://jonsully.net/blog/trailing-slashes-and-gatsby/


Here's what we know for certain:

  • Following traditional web-server parlance, Gatsby generates the static files to be consumed with a trailing slash (./file-name/index.html)
  • Serving duplicate content at both a slashed and un-slashed path for the same thing is bad for SEO and one's own analytics
  • The client-side routing via the @reach/router once the React app has hydrated is fully separate from the server-side "made-for-trailing-slashes" routing that Gatsby set the stage for
    • Further to this point, the @reach/router doesn't actually care whether you use trailing-slash or not during navigations - it treats them the same
    • Reach Router made a choice to ignore trailing slashes completely, therefore eliminating any difference when navigating up the path.

So how do we reconcile all of that? Here's my opinion. Step 0 is to make the decision to unify your site with or without the trailing slash. The goal is to have uniformity and non-duplicated content, that's all. Gatsby generates static files in preparation for a trailing slash. I don't like fighting my tools. My vote is to go ahead and use the trailing slash. So, to do that:

First

Make sure your web server of choice is following standard web-server parlance.

  • It should be serving directories as trailing-slash paths
    • Example: file produced as ./public/blog/index.html should be served at example.com/blog/
    • Requests to the server for the path /blog should be redirected to /blog/ since it's a directory, not a document
  • It should be serving documents as non-trailing-slash paths
    • Example: file produced as ./public/404.html should be served at example.com/404
    • Requests to the server for the path /404/ should be redirected to /404 since it's a document, not a directory
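
The directory/document convention in the list above can be sketched as a small function that maps a built file to the URL it should be served at and the variant that should redirect to it. `servedPaths` is a hypothetical helper written for illustration; it is not a Gatsby or Netlify API, and it only describes the expected behavior, it does not configure any server.

```javascript
// Map a file in Gatsby's ./public output to the URL a conventional static
// host would serve it at, plus the "wrong" variant that should redirect there.
// Hypothetical helper for illustration only.
function servedPaths(publicFile) {
  const file = publicFile.replace(/^\.?\/?public\//, '');
  if (file === 'index.html') return { canonical: '/', redirectFrom: null };
  if (file.endsWith('/index.html')) {
    // Directory: keep the trailing "/", redirect the un-slashed variant.
    const dir = '/' + file.slice(0, -'index.html'.length);
    return { canonical: dir, redirectFrom: dir.slice(0, -1) };
  }
  // Document: drop the extension, redirect the slashed variant.
  const doc = '/' + file.replace(/\.html$/, '');
  return { canonical: doc, redirectFrom: doc + '/' };
}

console.log(servedPaths('./public/blog/index.html'));
console.log(servedPaths('./public/404.html'));
```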

Sidenote: I'm a Netlify guy and Netlify's platform supports a feature called Pretty URLs, which enforces all of the above for you. Make sure it's enabled (it is by default, but make some cURL requests to your site to double-ensure). See here for more info.

I can't speak for other web hosting platforms, but this "holds to standard web parlance" policy should really be the standard... it's been the standard for web for 20+ years. It should require additional configuration to not do this. You can test your host's behavior yourself though, just make a quick site that's basically an ./index.html, ./foo.html, and ./bar/index.html then run some cURL requests and see what the host is sending back as far as redirects go. Tailor as needed.

Second

Explicitly use the trailing slash in all usages of the <Link> component and/or invocations of navigate()

This ensures that once the PWA hydrates and the @reach/router takes over, the address shown in the browser address bar will always have a trailing slash. Since the @reach/router doesn't enforce any rules around trailing slashes, failing to do this for all <Link>s and navigate()s will leave trailing slashes off and prevent uniformity. Especially if a user copies the address from their browser and shares it with someone else. Uniformity is the goal here.

This + the next step will also prevent the odd 'trailing-slash-flash' that can occur when you reload a page that did / didn't have a trailing slash and it flips to the other way.
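
One way to avoid auditing every link by hand is to run link targets through a small normalizer first. The sketch below is an assumption about how such a helper could look, not something from Gatsby or Reach Router; `withTrailingSlash` is a made-up name.

```javascript
// Hypothetical helper: normalize an internal link target so its path ends in "/".
// External URLs, relative targets, and file-like paths (last segment contains
// a dot) are returned unchanged.
function withTrailingSlash(to) {
  if (!to.startsWith('/') || to.startsWith('//')) return to;
  const match = /^([^?#]*)(.*)$/.exec(to);
  const pathname = match[1];
  const rest = match[2]; // query string and/or hash, kept as-is
  const isFile = /\/[^/]+\.[^/]+$/.test(pathname);
  if (isFile || pathname.endsWith('/')) return to;
  return pathname + '/' + rest;
}

console.log(withTrailingSlash('/blog'));
console.log(withTrailingSlash('/blog?page=2#top'));
console.log(withTrailingSlash('https://example.com/blog'));
```

In a component this would be used as `<Link to={withTrailingSlash(to)}>` or `navigate(withTrailingSlash(target))`.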

Third

Install gatsby-plugin-force-trailing-slashes. Even though Gatsby generates the static html pages within their own named directories by default, this plugin forces the path value for each page to end in a / - critical for configuring the core Gatsby @reach/router at build time. This prevents the case where you load a trailing-slash page and once the PWA hydrates it drops the slash; or when you re-load a non-trailing slash page and the dreaded no-slash -> slash -> no-slash sequence occurs. That's a result of the @reach/router expecting non-trailing slashes from its build-time configuration. This plugin fixes it. Here's the default config I use:

{
  resolve: `gatsby-plugin-force-trailing-slashes`,
  options: {
    excludedPaths: [`/404.html`],
  },
},

That's it. If you can nail those three steps, your site should be exclusively running in a trailing-slash-only, SEO-friendly, link-sharing-friendly format. Deploy it, click around, refresh to your heart's content, hit it from cURL; all of your pathing should be uniform.
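
To make the third step less of a black box, the path rewriting can be sketched with Gatsby's `onCreatePage` Node API. This is an illustration of the idea under stated assumptions, not the plugin's actual source; the `EXCLUDED_PATHS` constant is a stand-in for the `excludedPaths` option.

```javascript
// gatsby-node.js (sketch). Illustrates the idea; NOT the plugin's source.
const EXCLUDED_PATHS = ['/404.html']; // assumption: mirrors the excludedPaths option

const onCreatePage = ({ page, actions }) => {
  const { createPage, deletePage } = actions;

  // Already slashed (or excluded): nothing to do. This check also stops the
  // hook from looping when createPage below triggers onCreatePage again.
  if (EXCLUDED_PATHS.includes(page.path) || page.path.endsWith('/')) return;

  // Replace the page with a copy whose path ends in "/".
  deletePage(page);
  createPage({ ...page, path: `${page.path}/` });
};

exports.onCreatePage = onCreatePage;
```

If you go this route instead of installing the plugin, use one or the other, not both.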


The reality is that Gatsby sets us up by default for trailing slashes, but doesn't control any server-level settings and the @reach/router ... well it doesn't care, works both ways, and is at the mercy of build-time configuration. So when @LekoArts says

everything after that is your personal preference and I don't think we make any recommendations or comments about this

That's true, but it doesn't necessarily leave Gatsby users feeling the most satisfied. Gatsby can work without trailing slashes and with.. but you have to know the layers (including the server layer) well to set that up. I hope this provides some basis for how to do that 🙂

12 replies
@imnotdannorton

@jon-sully Wow, excellent post and I think I'll go with the slashes route -- a quick attempt at generating the html files did yield them output correctly, but my non-trailing slash routes stopped working (routing to the .html file directly did work).

Thinking a few steps forward, if I have Gatsby generating sitemaps and canonicals based on the paths provided by createPage() there'd be all sorts of references to those .html paths, just as the router keeps them -- even though Netlify would be serving them without the slash as documents server side. At the end of the day, it just seems like much less of a headache to keep things consistent with slashes if I want Gatsby to be in line with server behavior vs trying to 'trick' the Gatsby router and other plugins that would rely on paths sent to createPage()

Thanks again!

@jon-sully

Quite welcome. And yes, in its current iteration, I'd agree: Gatsby works better, smoother, cleaner (etc.) with trailing slashes, following the 3-step approach at the bottom of my article 👍🏻

What we could request to see from Gatsby as far as functionality goes is instead to change the behavior of the passed-in path string — if the path string ends in a slash, directory-ize the resulting HTML file. If it does not, just make it a document, don't directory-ize it. I think that would be a reasonable and effective step forward for folks since the same path string getting passed to the Reach router would match the type of underlying file structure Gatsby would make.

Unfortunately that's a tough change for Gatsby to make because it would break a lot of existing sites' trailing-slash paradigm as soon as they upgrade (with potentially very bad effects on resulting SEO)... but maybe they can make it a plugin that is installed on fresh Gatsby sites while leaving the functionality as-is for existing sites. Just spitballing here. 😅 But I will tag @pragmaticpat in case they want to spark any thoughts! 👍🏻
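
The proposed rule (a trailing slash in the path means a directory, no trailing slash means a document) can be sketched as follows. This is purely hypothetical: Gatsby does not behave this way, and `proposedOutputFile` is a made-up name.

```javascript
// Hypothetical illustration of the proposed rule; NOT current Gatsby behavior.
// Returns the file (relative to ./public) that a page path would produce.
function proposedOutputFile(pagePath) {
  if (pagePath === '/') return 'index.html';
  const trimmed = pagePath.replace(/^\//, '');
  return pagePath.endsWith('/')
    ? `${trimmed}index.html` // "/blog/" becomes blog/index.html (directory)
    : `${trimmed}.html`;     // "/about" becomes about.html (document)
}

console.log(proposedOutputFile('/blog/'));
console.log(proposedOutputFile('/about'));
```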

@stijnvanlieshout

@jon-sully Thanks for the solution. You suggest adding trailing slashes to all routes fed to Link and navigate. But what about Reach Router's Redirect component or useRedirect? Both strip the trailing slash away. How do you advise dealing with that?

@jon-sully

Interesting. To be honest, I haven't played with or looked at the Redirect / useRedirect workflow at all with the RR. When it comes to redirecting from old URLs that are now considered 'dead', I use Netlify's actual ('back end' / 'server') redirects. When it comes to pushing people around to correct spots in the Gatsby PWA once it hydrates, I've always just used the navigate function.

Here's an example from a large pure-React integration I wrote for Netlify Identity. This is taken from the login form. Ignoring what navigateTarget is (it's just an optional string) — the idea is that if you're already logged in, you shouldn't ever see the Login form. So when identity.user is present, navigate is called to push the user somewhere else.

https://github.com/jon-sully/gatsby-plugin-netlify-identity-gotrue-demo/blob/c4b6dfb42197bdb18f908a5fbab4e06e2dd3bfed/src/components/LoginForm.jsx#L18

  useEffect(() => {
    navigateTarget && identity.user && navigate(navigateTarget)
  }, [navigateTarget, identity.user])

To me this is a front-end redirect. Just the same as Redirect would be. If the goal is a back-end / server-level redirect, I wouldn't do that in Gatsby/React at all, but that implementation would be up to your hosting provider. Does that help?

@stijnvanlieshout

@jon-sully thank you, that does help. In my project I use useRedirect to take care of these auth redirects. That felt semantically more correct, as after all a user is redirected without pressing any buttons etc.

It would have been easier if reach router also enforced trailing slashes for useRedirect but I'll start using navigate like you did instead.

Answer selected by LekoArts

Good catch ~!

0 replies

Hi!

We just had the same issue in our Gatsby project: Some URLs had trailing slashes, some not. After doing some search on Google, I landed on this discussion. We got the problem solved, here are our learnings:

  • We use Netlify, so all URLs should have trailing slashes. All other URL formats will end in a redirect.

To make sure all links have trailing slashes, we did the following:

  1. We created a common Link component that is used for all links on our page. This component makes sure to always render internal URLs with a trailing slash.
  2. The sitemap plugin and canonical link plugin use the path you pass to createPage. We went over all createPage calls and checked that the passed-in path has a trailing slash.

With those two things fixed, we got consistent URLs in our project, with every link ending with a trailing slash. Hope that helps 🙂
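
The manual review of createPage paths described in step 2 could also be automated with a small check. The function below is a hypothetical sketch (the name and the idea of running it over a list of paths are assumptions, not part of the setup described above).

```javascript
// Hypothetical build-time check: return the page paths that lack a trailing
// slash. File-like paths such as "/404.html" are skipped on purpose.
function pathsMissingTrailingSlash(paths) {
  return paths.filter(p => !p.endsWith('/') && !/\.[^/]+$/.test(p));
}

console.log(pathsMissingTrailingSlash(['/', '/blog/', '/about', '/404.html']));
```

Failing the build when this list is non-empty would keep the sitemap and canonical URLs consistent as new pages are added.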

Btw: We also wrote a blog post about our findings. Feel free to read it; there are also links to our Link component implementation 😎
https://satellytes.com/blog/how-consistency-helps-you-to-optimize-gatsby-urls/

0 replies

There seems to be a documented method of using and forcing trailing slashes - is there anyone who opted to not use them?

I've been looking into this after noticing pages hitting 301 redirects. My entire project is fine without them, except for this minor detail. I believe it might be affecting SEO, especially when the canonical is pointing to the real source (rather than the non-trailing slash URL).

A couple things worth noting:

  1. All the links are pointing to the non-trailing slash URL as expected.
  2. All the pages are generated with non-trailing slash paths, as expected (and thus the sitemap looks good).
  3. It's easy to use the plugin and remove trailing slashes on Gatsby generated pages.

I've set up a fork of gatsby-starter-blog that makes an attempt at removing trailing slashes and produces the 301 redirect as seen below via gatsby serve.

https://github.com/laneparton/trailing-slash-test

[Screenshot: Screen Shot 2021-09-27 at 10:43:23 AM]

I've also noticed how Gatsbyjs.com handles this. Using the test repo linked above on Gatsby Cloud, there is no 301 redirect. So there must be something on the host side that fixes this?

https://trailingslashtestmaster.gatsbyjs.io/new-beginnings

[Screenshot: Screen Shot 2021-09-27 at 10:43:56 AM]

2 replies
@willydavid1

Did you manage to solve it? right now I have the same problem. I need status 200

@laneparton

Sadly, I didn't. I think we're going to go ahead and move forward with the trailing slash method as described above rather than try to fight this battle 😆

Has anybody found a way to only serve from non-trailing slashes? @jon-sully

This is important to us due to complexities with our monorepo and using rewrites to proxy non-gatsby URLs.

If I add something like this in vercel.json

  "redirects": [
    {
      "source": "/:path*/",
      "destination": "/:path*",
      "statusCode": 302
    },

I end up with the redirect stripping the trailing slash, but then gatsby will still 301 and permanently redirect to the trailing slash.

[Screenshot: Screen Shot 2021-11-24 at 12:38:30 PM]

This causes an infinite redirect, and is unusable for us. Anybody have any advice?
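
To see why these two behaviors can never settle, the redirect chain can be simulated locally. The rules below are simplified stand-ins for the slash-stripping redirect and the slash-adding 301 described above; they are assumptions for illustration, not Vercel's or Gatsby's actual matching logic.

```javascript
// Each rule maps a request path to a redirect target, or null for "serve it".
// Both rules are simplified stand-ins for the behavior described above.
const stripSlash = path =>
  path.length > 1 && path.endsWith('/') ? path.slice(0, -1) : null;
const addSlash = path =>
  !path.endsWith('/') && !/\.[^/]+$/.test(path) ? `${path}/` : null;

// Follow redirects until a path is served, a path repeats, or maxHops is hit.
function followRedirects(start, rules, maxHops = 10) {
  const seen = [start];
  let current = start;
  for (let hop = 0; hop < maxHops; hop++) {
    let next = null;
    for (const rule of rules) {
      next = rule(current);
      if (next !== null) break;
    }
    if (next === null) return { loop: false, chain: seen };
    if (seen.includes(next)) return { loop: true, chain: [...seen, next] };
    seen.push(next);
    current = next;
  }
  return { loop: true, chain: seen };
}

console.log(followRedirects('/blog/', [stripSlash, addSlash]));
console.log(followRedirects('/blog', [addSlash]));
```

With both rules active every path has a redirect target, so the chain revisits a path; with only one rule active it terminates after at most one hop.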

If not, it seems we will be forced to move away from Gatsby and only use Next.js. It's important for our SEO to serve from only one route, and for us, we must use a route that does not end in a trailing slash.

0 replies

So I've written somewhat extensively about using trailing slashes and it's the approach I (fairly strongly) encourage but if you really prefer using non-trailing slashes I believe it can be done. I'll build a PoC to ensure I know what I'm talking about then post back here 🙂

0 replies