Instantly compare Screaming Frog crawl data with your XML sitemaps. Spot missing pages, fix orphaned content, and keep your site index clean.
Step #1: Upload XLSX File
After you run a crawl in Screaming Frog, export the list of internal URLs in Excel format, which will be a .xlsx file. There are a few configuration options here, described below.
Only HTML files: This option filters out any non-HTML content from the uploaded file. If checked, only URLs that point to HTML pages (typically web pages) will be included, while other file types (like PDFs, scripts, images, etc.) will be excluded.
Only Indexable URLs: Selecting this option filters the URLs to include only those that are "indexable." This means URLs marked as non-indexable in the uploaded file (often pages blocked from search engines) will be excluded.
Only Clean URLs: When checked, this filter excludes any URL that contains a query string or fragment (marked by ? or #). This option is helpful if you want to avoid temporary or duplicate URLs that carry extra parameters for tracking, session IDs, or filtering.
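The three filters above can be sketched as simple checks on each exported row. This is a minimal illustration, not the tool's actual code; the column names ("Address", "Content Type", "Indexability") are assumptions based on a typical Screaming Frog export.

```python
def apply_filters(rows, only_html=True, only_indexable=True, only_clean=True):
    """Filter exported crawl rows. Each row is a dict keyed by column name
    (column names here are hypothetical, modeled on a Screaming Frog export)."""
    filtered = []
    for row in rows:
        url = row["Address"]
        # Only HTML files: keep rows whose content type is an HTML page
        if only_html and "text/html" not in row.get("Content Type", ""):
            continue
        # Only Indexable URLs: drop anything marked non-indexable
        if only_indexable and row.get("Indexability") != "Indexable":
            continue
        # Only Clean URLs: drop URLs with a query string or fragment
        if only_clean and ("?" in url or "#" in url):
            continue
        filtered.append(url)
    return filtered
```

For example, a PDF row or a URL like `https://example.com/page?utm_source=x` would be dropped when all three options are checked.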
Step #2: Upload Sitemap XML Files
Download the XML sitemap file and save it to your computer. If you have multiple sitemap files, download each one that lists URLs, then upload all of them in the Upload Sitemap XML Files section.
NOTE: If your sitemap is an index that points to several other XML sitemap files, you'll need to open each of those individual sitemaps and download them one by one.
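If you want to check whether a file is a sitemap index or a regular URL set before downloading its children, the distinction comes down to the root element (`sitemapindex` vs. `urlset` in the sitemaps.org namespace). A small sketch using only the Python standard library:

```python
import xml.etree.ElementTree as ET

# Standard sitemaps.org namespace used by both urlset and sitemapindex files
NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def parse_sitemap(xml_text):
    """Return ('index', child_sitemap_urls) for a sitemap index,
    or ('urlset', page_urls) for a regular sitemap."""
    root = ET.fromstring(xml_text)
    locs = [loc.text.strip() for loc in root.iter(NS + "loc")]
    kind = "index" if root.tag == NS + "sitemapindex" else "urlset"
    return kind, locs
```

If `parse_sitemap` reports `'index'`, the returned URLs are the child sitemaps you need to download individually, as the note above describes.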
When you click the 'Find Missing Pages' button, the tool compares the URLs from your uploaded XLSX file against those in your XML sitemaps. It applies any selected filters, then generates a report showing pages missing from the sitemap, orphan pages, and the filtered list of URLs, which you can download as an XLSX file.
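Conceptually, the comparison is two set differences: crawled URLs absent from every sitemap are "missing from sitemap," and sitemap URLs the crawl never reached are the orphan candidates. A hedged sketch (this assumes that is how the tool defines "orphan pages"; the real report may differ):

```python
def compare(crawled_urls, sitemap_urls):
    """Compare the filtered crawl export against the combined sitemap URLs."""
    crawled, in_sitemap = set(crawled_urls), set(sitemap_urls)
    return {
        # In the crawl but not declared in any sitemap
        "missing_from_sitemap": sorted(crawled - in_sitemap),
        # Declared in a sitemap but never reached by the crawl
        "orphans": sorted(in_sitemap - crawled),
    }
```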