Robots Meta Tag & Robots.txt Explained — Beginner's Guide
Your robots meta tag and your robots.txt file are two things that tell Google which pages it is allowed to visit and show in search results. Get them wrong and Google may stop indexing your entire website without warning. This guide explains exactly what they are, how to check if yours are set up correctly, and how to fix them if they are blocking Google — step by step, with no coding experience needed.
What you will learn in this guide
- What a robots meta tag is and what it does to your page in Google
- What a robots.txt file is and how it is different from a robots meta tag
- How to check if your robots meta tag or robots.txt is blocking Google
- What noindex means and when it should and should not be used
- How to fix a robots meta tag that is accidentally blocking your page
- How to create or fix your robots.txt file
- How to upload your fixed files back to your website using FileZilla
- How to ask Google to re-check your page after you have fixed it
1 What is a robots meta tag?
Let us start completely from the beginning. Do not worry if you have never heard of any of this before — by the end of this section it will make perfect sense.
What is Google doing when it visits your website?
Google has a robot — not a physical robot, just a computer program — that visits websites automatically. This robot is called Googlebot. It reads your pages and then stores them in Google's enormous list of websites. This process is called crawling and indexing. When someone searches on Google, Google looks through this list and shows the most relevant pages.
If Googlebot cannot visit your page, or if something on your page tells it to go away, your page will not appear in Google search results at all. It will be invisible.
So what is a robots meta tag?
A robots meta tag is a single line of code that lives inside the `<head>` section of your HTML file. It is a direct instruction to Googlebot. It tells Google whether it is allowed to show your page in search results.
There are two main versions of this tag:
`<meta name="robots" content="index, follow">`
This tells Google: "Yes, you can show this page in search results, and yes, you can follow all the links on this page."
`<meta name="robots" content="noindex">`
This tells Google: "Do not show this page in search results." Your page disappears from Google completely.
What does noindex mean exactly?
Noindex means "do not put this page in your index". The index is Google's giant list of pages. If your page is not in the index, it cannot appear in search results. Full stop. It does not matter how good your content is, how many links you have, or how long the page has been live — a noindex tag makes the page invisible to Google.
When should noindex ever be used?
There are some pages where noindex is correct and intentional — pages you genuinely do not want Google to show. Examples include:
- Thank you pages (after someone submits a form)
- Login or account pages
- Admin pages
- Duplicate pages that exist for technical reasons
For every other page — your homepage, your product pages, your blog posts, your about page, your contact page — noindex should never be there.
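For example, a thank-you page that should stay out of search results might carry the tag in its `<head>` like this (a minimal illustrative page, not taken from any real site):

```html
<!DOCTYPE html>
<html>
<head>
  <title>Thank You</title>
  <!-- Tell Google not to show this page in search results -->
  <meta name="robots" content="noindex">
</head>
<body>
  <p>Thanks! We received your message.</p>
</body>
</html>
```

Every other page on the site would use `content="index, follow"` instead, or simply leave the tag out.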
2 What is a robots.txt file — and how is it different?
Alongside the robots meta tag, there is a separate thing called a robots.txt file. They do a similar job but work differently. You need to understand both.
What is a robots.txt file?
A robots.txt file is a plain text file that lives at the very root of your website — at the address yourdomain.com/robots.txt. Before Googlebot visits any page on your site, it checks this file first to see if it is allowed in.
Think of it like a sign on the front door of your website. The robots.txt file can say "everyone is welcome" or "certain areas are off limits".
What is the difference between robots.txt and the robots meta tag?
| | Robots meta tag | Robots.txt file |
|---|---|---|
| Where it lives | Inside a single HTML page | A separate file at yourdomain.com/robots.txt |
| What it controls | Whether Google indexes that specific page | Whether Google is even allowed to visit pages at all |
| Who it affects | One page only | The whole website or specific folders |
| Most common mistake | Accidentally adding noindex to an important page | Accidentally blocking all crawlers with Disallow: / |
What does a robots.txt file look like?
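A simple, correctly set up robots.txt file looks like this (with yourdomain.com standing in for your own domain):

```
User-agent: *
Allow: /
Disallow: /wp-admin/
```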
The line `User-agent: *` means "this rule applies to all robots, including Googlebot". The line `Allow: /` means "you are allowed everywhere". The line `Disallow: /wp-admin/` means "but not in the admin folder" — which is correct and intentional.
`Disallow: /` — this single line blocks Googlebot from your entire website. Every page disappears from Google. This happens by accident more often than you would think — especially after someone sets up a new website and forgets to remove a temporary block.
3 Check if Google can currently see your pages
Before you change anything, you need to find out exactly what your pages are currently telling Google. There are three places to check.
Check 1 — Run the AIPageSEO audit
- **1. Go to the AIPageSEO audit tool.** Open a new browser tab and go to https://aipageseo.com/seo-audit-platform.html.
- **2. Type in your website address and run the audit.** Type the full address of the page you want to check — for example `https://yourdomain.com` — and click Run Audit. Wait for it to finish.
- **3. Look for the Robots section in the results.** When the audit finishes, scroll down and look for a section called Robots or SEO. If you see any red flags next to NOINDEX, ROBOTS, or ROBOTS_DISALLOW_ALL — those need fixing. The audit will tell you exactly what it found.
Check 2 — Look at your page source directly
- **1. Open your page in a browser.** Go to the page you want to check — for example your homepage at `https://yourdomain.com`.
- **2. View the page source.** Right-click anywhere on the page. A menu will appear. Click View Page Source. A new tab will open showing the raw HTML code of your page.
- **3. Search for the robots tag.** Press `Ctrl + F` on your keyboard. A search box appears. Type `robots` and press Enter. Look at what comes up. If you see `content="noindex"` anywhere — that is a problem. If you see `content="index, follow"` — that is correct.
- **4. What if there is no robots tag at all?** If the search finds nothing, your page has no robots meta tag. That is actually fine — Google treats a missing robots tag the same as `index, follow`. You only have a problem if the tag exists and says `noindex`.
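If you are comfortable running a short script, the same check can be automated. Here is a sketch (the function name is mine, and the simple regex approach assumes reasonably well-formed HTML — it is not a full HTML parser):

```python
import re

def robots_meta_blocks_indexing(html: str) -> bool:
    """Return True if the page's robots meta tag contains 'noindex'.

    A missing robots meta tag counts as indexable, because Google
    treats no tag the same as "index, follow".
    """
    # Scan every <meta ...> tag; attributes may appear in any order
    # and any letter case on real pages.
    for tag in re.findall(r"<meta[^>]+>", html, flags=re.IGNORECASE):
        if re.search(r'name\s*=\s*["\']robots["\']', tag, re.IGNORECASE):
            content = re.search(r'content\s*=\s*["\']([^"\']*)["\']',
                                tag, re.IGNORECASE)
            if content and "noindex" in content.group(1).lower():
                return True
    return False

# A page that would be hidden from Google:
blocked = '<head><meta name="robots" content="noindex"></head>'
print(robots_meta_blocks_indexing(blocked))   # True

# A page Google may index:
visible = '<head><meta name="robots" content="index, follow"></head>'
print(robots_meta_blocks_indexing(visible))   # False
```

To check a live page you could pair this with a download step, for example `urllib.request.urlopen`, and pass the fetched HTML into the function.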
Check 3 — Look at your robots.txt file
- **1. Open your robots.txt file in a browser.** In your browser address bar, type your domain name followed by `/robots.txt`. For example: `https://yourdomain.com/robots.txt` — then press Enter.
- **2. Read what it says.** You will see the contents of your robots.txt file. Look for any line that says `Disallow:`. If you see `Disallow: /` on its own — that is blocking everything and needs fixing immediately. If you see `Disallow:` with nothing after it, or `Allow: /`, those are fine.
- **3. What if the page says 404 or cannot be found?** If your robots.txt file does not exist at all, that is not a problem — Google will simply assume everything is allowed. However, it is good practice to have one. Section 6 covers how to create a basic robots.txt file from scratch.
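You can also test a robots.txt file programmatically using Python's built-in robots.txt parser. A sketch (the helper name is mine; note that Python's parser applies rules in file order, which can differ slightly from Google's longest-match behaviour):

```python
from urllib.robotparser import RobotFileParser

def can_google_fetch(robots_txt: str, url: str) -> bool:
    """Check whether Googlebot may fetch `url` under this robots.txt,
    using the standard library's robots.txt parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)

# The dangerous one-liner: blocks the whole site.
print(can_google_fetch("User-agent: *\nDisallow: /",
                       "https://yourdomain.com/"))           # False

# A healthy file: the site is open to crawlers.
healthy = "User-agent: *\nAllow: /\nDisallow: /wp-admin/"
print(can_google_fetch(healthy, "https://yourdomain.com/"))  # True
```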
4 Fix a robots meta tag that is accidentally blocking your page
If your check in Section 3 found a noindex tag on a page that should be visible in Google, here is how to fix it step by step.
- **1. Open FileZilla on your computer.** Click the Start button (Windows) or open the Applications folder (Mac) and open FileZilla. If you have not installed it yet, see the Before You Start guide linked above.
- **2. Connect to your server.** Fill in the boxes at the top of FileZilla — Host (your domain name), Username and Password — leave Port blank, and click Quickconnect. The right panel will fill with your server files.
- **3. Open your website folder.** In the right panel, double-click `httpdocs` if you use Plesk, or `public_html` if you use cPanel. You will see your website files listed.
- **4. Find the file for the page you need to fix.** The file name usually matches the page URL. For your homepage look for `index.html`. For your about page look for `about.html`. For any other page the file name is usually the last part of the URL.
- **5. Download a backup copy first.** Right-click the file and choose Download. In the left panel, find the downloaded file, right-click it and choose Rename. Rename it to something like `index-backup.html`. This is your safety copy.
- **6. Download the working copy to your Desktop.** Right-click the same file again in the right panel and choose Download. This time make sure your Desktop is selected in the left panel so the file saves somewhere easy to find.
- **7. Open the file in Notepad++.** Open Notepad++. At the top of the screen click File, then click Open. A window appears. Navigate to your Desktop, click on the HTML file you just downloaded, and click Open. The file will open in Notepad++ showing lots of code.
- **8. Find the robots meta tag.** Press `Ctrl + F` on your keyboard. A small search box appears at the bottom of Notepad++. Type `noindex` and press Enter. Notepad++ will jump straight to that line. You will see something like this: `<meta name="robots" content="noindex">`. This is the line you need to fix.
- **9. Change noindex to index, follow.** Double-click the word `noindex` in that line to select it, then type `index, follow` to replace it. The line should now look like this: `<meta name="robots" content="index, follow">`
- **10. Double-check what you have done.** Look at the line you just changed. It should say `content="index, follow"`. There should be no `noindex` anywhere on that line. If it looks right, move on to the next step.
- **11. Save the file.** Press `Ctrl + S` on your keyboard. The small red dot on the file tab at the top of Notepad++ will disappear. That means the file is saved.
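If you would rather script the edit than use Notepad++, the replacement in steps 8 and 9 can be sketched in a few lines of Python (`index.html` is a placeholder path; this assumes the tag is written exactly as `content="noindex"`, with no extra spacing):

```python
from pathlib import Path

def unblock_page(path: str) -> bool:
    """Swap a blocking robots meta value for "index, follow".

    Returns True if the file was changed, False if no blocking
    tag was found.
    """
    page = Path(path)
    html = page.read_text(encoding="utf-8")
    fixed = html.replace('content="noindex"', 'content="index, follow"')
    if fixed == html:
        return False  # nothing to fix
    page.write_text(fixed, encoding="utf-8")
    return True

# Example (placeholder path):
# unblock_page("index.html")
```

Keep a backup copy of the file first, exactly as in step 5 above.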
5 Fix a robots.txt file that is blocking Google
If your check in Section 3 found that your robots.txt file contains Disallow: / — meaning it is blocking Google from your whole website — here is how to fix it.
- **1. Open FileZilla and connect to your server.** Open FileZilla, fill in the Host, Username and Password boxes, leave Port blank, and click Quickconnect.
- **2. Find your robots.txt file.** In the right panel, double-click `httpdocs` (Plesk) or `public_html` (cPanel). Look for a file called `robots.txt`. It will be in the main folder — not inside any other folder.
- **3. Download a backup and a working copy.** Right-click `robots.txt` and choose Download. In the left panel rename it to `robots-backup.txt`. Then download it again to your Desktop as your working copy.
- **4. Open robots.txt in Notepad++.** Open Notepad++. Click File → Open. Navigate to your Desktop and open the `robots.txt` file. The contents will appear — it will be short, probably just a few lines.
- **5. Find the problem line.** Look for a line that says `Disallow: /` on its own. That forward slash after the colon is blocking everything. You need to either delete that line or change it.
- **6. Fix the file.** Replace the entire contents of the file with this correct version. Select everything with `Ctrl + A`, delete it, then type:

  ```
  User-agent: *
  Allow: /
  Disallow: /wp-admin/

  Sitemap: https://yourdomain.com/sitemap.xml
  ```

  Replace `yourdomain.com` with your actual domain name. If you do not use WordPress, remove the `Disallow: /wp-admin/` line entirely.
- **7. Save the file.** Press `Ctrl + S` to save.
6 Create a robots.txt file if you do not have one
If your robots.txt file does not exist at all, here is how to create one from scratch using Notepad++.
- **1. Open Notepad++.** Click Start (Windows) or open Applications (Mac) and open Notepad++. A blank editing window will open with a tab at the top called "new 1".
- **2. Type the contents of your robots.txt file.** Click inside the blank editor and type the following exactly as shown, replacing `yourdomain.com` with your actual domain:

  ```
  User-agent: *
  Allow: /
  Disallow: /wp-admin/

  Sitemap: https://yourdomain.com/sitemap.xml
  ```

  If you do not use WordPress, leave out the `Disallow: /wp-admin/` line. If you do not have a sitemap yet, leave out the Sitemap line too — you can add it later.
- **3. Save the file with the correct name.** Press `Ctrl + S`. A save dialog box will appear asking you to choose a location and filename. Navigate to your Desktop. In the File Name box, type `robots.txt` exactly — including the .txt extension. In the Save As Type dropdown, choose All types (*.*). Then click Save.
7 Upload your fixed files back to your server
Whether you fixed an HTML file, a robots.txt file, or both — the upload process is the same for each one.
- **1. Switch back to FileZilla.** Click on the FileZilla window in your taskbar. If it has disconnected, fill in the boxes again and click Quickconnect.
- **2. Make sure you are in the right folder on the server.** In the right panel, make sure you can see your website files — the ones in `httpdocs` or `public_html`. If you just see a single `/` at the top, double-click into the correct folder.
- **3. Find your fixed file in the left panel.** In the left panel, navigate to your Desktop. You should see your fixed file there — either an HTML file like `index.html` or your `robots.txt` file.
- **4. Upload the file.** Right-click the file in the left panel and choose Upload. A box may appear asking if you want to overwrite the existing file. Click OK or Overwrite. A progress bar will appear at the bottom of FileZilla. When it disappears, the upload is complete.
- **5. Check it worked — for HTML files.** Open a browser tab and go to your page. Right-click and choose View Page Source. Press `Ctrl + F` and search for `robots`. You should now see `content="index, follow"` and no sign of `noindex` anywhere.
- **6. Check it worked — for robots.txt.** Open a browser tab and go to `https://yourdomain.com/robots.txt`. You should see your new correct robots.txt file displayed. Check it says `Allow: /` and does not say `Disallow: /` on its own.
8 Tell Google to re-check your page
After you have fixed your robots settings, Google will not know about it straight away. Google visits websites on its own schedule — which could be days or even weeks. You can speed this up by telling Google directly using a free tool called Google Search Console.
If you have Google Search Console set up
- **1. Log in to Google Search Console.** Go to https://search.google.com/search-console and sign in with the Google account connected to your website.
- **2. Use the URL Inspection tool.** At the top of the screen there is a search bar. Type the full address of the page you fixed — for example `https://yourdomain.com` — and press Enter.
- **3. Click Request Indexing.** You will see information about that page. Look for a button that says Request Indexing and click it. Google will confirm the request has been submitted. This tells Google to visit your page again as soon as possible and pick up the changes you made.
- **4. Wait 24 to 48 hours.** After a day or two, go back to Google Search Console and check the URL again. It should now show the page as indexed. If your page was previously missing from Google search results, it should start appearing again within a few days.
If you do not have Google Search Console yet
Google Search Console is free and every website owner should set it up. Without it, Google will re-check your page eventually on its own — but it could take longer. Search for "how to set up Google Search Console" and follow Google's official guide. It takes about 15 minutes.
9 Common mistakes to avoid
- ⚠ **Thinking a missing robots tag is a problem.** If your page has no robots meta tag at all, that is not a problem. Google treats a missing robots tag exactly the same as `index, follow`. You only need to add or change the tag if you have found a `noindex` value that should not be there.
- ⚠ **Accidentally putting Disallow: / in your robots.txt.** This single line blocks Googlebot from your entire website. It is an easy mistake to make when setting up a new site. Always check your robots.txt file by visiting `yourdomain.com/robots.txt` in a browser after making any changes.
- ⚠ **Blocking CSS and JavaScript files in robots.txt.** Some older guides suggest blocking your CSS and JavaScript folders in robots.txt to save crawl budget. This is wrong. Googlebot needs to read your CSS and JavaScript to understand how your page looks and works. Blocking them can cause Google to rank your page lower because it cannot render it properly.
- ⚠ **Removing noindex from a page that needs it.** Before removing noindex from any page, make sure it is a page you actually want Google to show in search results. Login pages, thank you pages, duplicate pages and admin pages should keep their noindex tags. Only remove noindex from pages that should genuinely appear in Google search results.
- ⚠ **Expecting Google to update immediately.** Even after you fix your robots settings and request indexing in Google Search Console, it may take several days for your pages to reappear in Google search results. This is normal. If your pages were blocked for a long time, it can take a bit longer for them to recover their previous rankings.
- ⚠ **Forgetting to save before uploading.** If you forget to press `Ctrl + S` in Notepad++ before going back to FileZilla, you will upload the unchanged file. Always check the red dot on the tab is gone before uploading.