First Look at Next-Gen Bulk Index Checker Capabilities
In today's digital age, the importance of SEO cannot be overstated, and one of the most critical tools in an SEO professional's arsenal is the bulk index checker. The recent advancements in next-gen bulk index checker capabilities have been a game changer for how businesses optimize their online presence. Let's dive into what these new capabilities look like and why they're making such a big splash!
First off, the next-generation bulk index checkers have significantly improved in terms of speed and accuracy. This is crucial because time is money, especially in the digital marketing world. With previous versions, you might have had to wait for hours, or even days, to get results back. But now, results can be accessed almost in real-time! Imagine being able to check the indexing status of thousands of pages within minutes.
That's efficiency at its best!
Another standout feature of these updated tools is the enhanced user interface.
The older systems were often clunky and not very user-friendly. It was easy to get lost in the myriad of tabs and buttons.
However, the new interfaces have been streamlined to ensure that even the least tech-savvy users can navigate through the processes effortlessly. The simplicity and intuitiveness of the design now mean that more people can utilize this powerful tool without needing specialized training.
Additionally, the next-gen bulk index checkers now offer deeper insights into the data they collect. They don't just tell you whether a page is indexed; they provide detailed reports on why certain pages might not be getting indexed and suggest actionable steps to rectify these issues. This kind of detailed analysis was sorely missing in the past and is incredibly beneficial.
What's also exciting is the integration of AI technologies into these tools. AI helps in predicting indexing issues before they become a significant problem. For instance, AI can analyze patterns in pages that frequently fail to index and identify common characteristics among them. This predictive capability allows users to proactively make adjustments to improve their SEO strategies.
However, these tools are not without their challenges. The accuracy of the results can sometimes be a concern.
Despite improvements, there are instances where the data might not be 100% reliable due to the complexities involved in how search engines index pages. It's crucial for users not to rely solely on these tools but to use them in conjunction with other SEO techniques.
In conclusion, the first look at next-gen bulk index checker capabilities suggests a bright future for SEO professionals and website managers!
The improvements in speed, user interface, depth of insights, and the integration of AI are set to revolutionize how we understand and enhance the visibility of our digital content.
However, it's important to remember that these tools, while powerful, are just one part of a broader SEO strategy.
They must be used wisely and in balance with other SEO practices to achieve the best results. So, let's embrace these new capabilities and push towards more optimized and successful online presences!
Meta search engines reuse the indices of other services and do not store a local index whereas cache-based search engines permanently store the index along with the corpus. Unlike full-text indices, partial-text services restrict the depth indexed to reduce index size. Larger services typically perform indexing at a predetermined time interval due to the required time and processing costs, while agent-based search engines index in real time.
The purpose of storing an index is to optimize speed and performance in finding relevant documents for a search query. Without an index, the search engine would scan every document in the corpus, which would require considerable time and computing power. For example, while an index of 10,000 documents can be queried within milliseconds, a sequential scan of every word in 10,000 large documents could take hours. The additional computer storage required to store the index, as well as the considerable increase in the time required for an update to take place, are traded off for the time saved during information retrieval.
Major factors in designing a search engine's architecture include:
Merge factors
How data enters the index, or how words or subject features are added to the index during text corpus traversal, and whether multiple indexers can work asynchronously. The indexer must first check whether it is updating old content or adding new content. Traversal typically correlates to the data collection policy. Search engine index merging is similar in concept to the SQL Merge command and other merge algorithms.[4]
Storage techniques
How to store the index data, that is, whether information should be data compressed or filtered.
Lookup speed
How quickly a word can be found in the inverted index. The speed of finding an entry in a data structure, compared with how quickly it can be updated or removed, is a central focus of computer science.
Fault tolerance
How important it is for the service to be reliable. Issues include dealing with index corruption, determining whether bad data can be treated in isolation, dealing with bad hardware, partitioning, and schemes such as hash-based or composite partitioning,[6] as well as replication.
Suffix tree
Figuratively structured like a tree, the suffix tree supports linear time lookup. It is built by storing the suffixes of words. The suffix tree is a type of trie. Tries support extendible hashing, which is important for search engine indexing.[7] Suffix trees are used for searching for patterns in DNA sequences and clustering. A major drawback is that storing a word in the tree may require space beyond that required to store the word itself.[8] An alternate representation is a suffix array, which is considered to require less virtual memory and supports data compression such as the BWT algorithm.
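Since tries underlie the suffix tree, a minimal Python sketch of a plain prefix trie may help make the idea concrete. This is an illustrative toy (the class names and sample words are invented here), not a suffix tree or a production index structure:

```python
# Minimal prefix trie: lookup time is proportional to the key length,
# independent of how many words are stored. A suffix tree extends this
# idea by inserting every suffix of every word.

class TrieNode:
    def __init__(self):
        self.children = {}    # character -> child TrieNode
        self.is_word = False  # True if a word ends at this node

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def has_prefix(self, prefix):
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return False
            node = node.children[ch]
        return True

t = Trie()
for w in ["cow", "cat", "says"]:
    t.insert(w)
print(t.has_prefix("ca"))   # True  ("cat" starts with "ca")
print(t.has_prefix("dog"))  # False
```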
A major challenge in the design of search engines is the management of serial computing processes. There are many opportunities for race conditions and coherent faults. For example, a new document is added to the corpus and the index must be updated, but the index simultaneously needs to continue responding to search queries. This is a collision between two competing tasks. Consider that authors are producers of information, and a web crawler is the consumer of this information, grabbing the text and storing it in a cache (or corpus). The forward index is the consumer of the information produced by the corpus, and the inverted index is the consumer of information produced by the forward index. This is commonly referred to as a producer-consumer model. The indexer is the producer of searchable information and users are the consumers that need to search. The challenge is magnified when working with distributed storage and distributed processing. In an effort to scale with larger amounts of indexed information, the search engine's architecture may involve distributed computing, where the search engine consists of several machines operating in unison. This increases the possibilities for incoherency and makes it more difficult to maintain a fully synchronized, distributed, parallel architecture.[13]
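The producer-consumer pattern described above can be sketched in a few lines of Python. The thread layout, toy corpus, and locking scheme here are assumptions for illustration only, not a production indexing pipeline:

```python
# Toy producer-consumer pipeline: a "crawler" thread produces documents,
# an "indexer" thread consumes them into an inverted index, while the
# main thread answers queries against the (possibly stale) index.
import threading, queue

doc_queue = queue.Queue()
index = {}                      # word -> set of document ids
index_lock = threading.Lock()   # guards against the race described above

def crawler(docs):
    for doc in docs:
        doc_queue.put(doc)      # producer
    doc_queue.put(None)         # sentinel: no more documents

def indexer():
    while True:
        item = doc_queue.get()
        if item is None:
            break
        doc_id, text = item
        with index_lock:        # consumer updates the shared index
            for word in text.lower().split():
                index.setdefault(word, set()).add(doc_id)

docs = [(1, "the cow says moo"), (2, "the cat and the hat")]
threading.Thread(target=crawler, args=(docs,)).start()
t = threading.Thread(target=indexer)
t.start()

with index_lock:                # a query may arrive while indexing runs
    print(index.get("cow", set()))
t.join()
print(index["the"])             # {1, 2} once indexing completes
```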
Many search engines incorporate an inverted index when evaluating a search query to quickly locate documents containing the words in a query and then rank these documents by relevance. Because the inverted index stores a list of the documents containing each word, the search engine can use direct access to find the documents associated with each word in the query in order to retrieve the matching documents quickly. The following is a simplified illustration of an inverted index:
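As a minimal sketch of such an index, the following Python snippet builds a Boolean inverted index over the same toy documents used in the forward-index example later in this article (the integer document IDs are illustrative):

```python
# Building a Boolean inverted index: each word maps to the set of
# documents containing it (no positions or frequencies yet).
docs = {
    1: "the cow says moo",
    2: "the cat and the hat",
    3: "the dish ran away with the spoon",
}

inverted = {}
for doc_id, text in docs.items():
    for word in text.split():
        inverted.setdefault(word, set()).add(doc_id)

print(inverted["the"])   # {1, 2, 3}
print(inverted["cow"])   # {1}

# Evaluating a multi-word query is a set intersection:
print(inverted["the"] & inverted["cow"])  # {1}
```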
This index can only determine whether a word exists within a particular document, since it stores no information regarding the frequency and position of the word; it is therefore considered to be a Boolean index. Such an index determines which documents match a query but does not rank matched documents. In some designs the index includes additional information such as the frequency of each word in each document or the positions of a word in each document.[14] Position information enables the search algorithm to identify word proximity to support searching for phrases; frequency can be used to help in ranking the relevance of documents to the query. Such topics are the central research focus of information retrieval.
The inverted index is a sparse matrix, since not all words are present in each document. To reduce computer storage memory requirements, it is stored differently from a two dimensional array. The index is similar to the term document matrices employed by latent semantic analysis. The inverted index can be considered a form of a hash table. In some cases the index is a form of a binary tree, which requires additional storage but may reduce the lookup time. In larger indices the architecture is typically a distributed hash table.[15]
Implementation of Phrase Search Using an Inverted Index
For phrase searching, a specialized form of an inverted index called a positional index is used. A positional index not only stores the ID of the document containing the token but also the exact position(s) of the token within the document in the postings list. The occurrences of the phrase specified in the query are retrieved by navigating these postings lists and identifying the indexes at which the desired terms occur in the expected order (the same as the order in the phrase). So if we are searching for occurrences of the phrase "First Witch", we would:
Retrieve the postings list for "first" and "witch"
Identify the first time that "witch" occurs after "first"
Check that this occurrence is immediately after the occurrence of "first".
If not, continue to the next occurrence of "first".
The postings lists can be navigated using a binary search in order to minimize the time complexity of this procedure.[16]
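A sketch of this procedure in Python, using the standard bisect module for the binary search; the postings themselves (document ID 7 and the positions 10, 11, 42, 90) are invented for illustration:

```python
# Phrase search over a positional index. positional[word][doc] is a
# sorted list of token offsets; the phrase matches when the second word
# occurs at position p+1 for some occurrence p of the first word.
import bisect

positional = {
    "first": {7: [10, 42]},
    "witch": {7: [11, 90]},
}

def phrase_match(w1, w2, doc):
    pos1 = positional.get(w1, {}).get(doc, [])
    pos2 = positional.get(w2, {}).get(doc, [])
    for p in pos1:
        # binary search for p + 1 in the second postings list
        i = bisect.bisect_left(pos2, p + 1)
        if i < len(pos2) and pos2[i] == p + 1:
            return True
    return False

print(phrase_match("first", "witch", 7))  # True: positions 10 and 11
```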
The inverted index is filled via a merge or rebuild. A rebuild is similar to a merge but first deletes the contents of the inverted index. The architecture may be designed to support incremental indexing,[17] where a merge identifies the document or documents to be added or updated and then parses each document into words. For technical accuracy, a merge conflates newly indexed documents, typically residing in virtual memory, with the index cache residing on one or more computer hard drives.
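A minimal sketch of such a merge, with a dictionary standing in for the on-disk index; real merges stream sorted postings from disk rather than holding everything in memory:

```python
# Incremental merge: postings for newly indexed documents (held in
# memory) are folded into the main index. A rebuild would be the same
# operation after first clearing main_index.
main_index = {"the": {1, 2}, "cow": {1}}    # stand-in for the on-disk index
new_postings = {"the": {3}, "spoon": {3}}   # newly parsed documents

for word, doc_ids in new_postings.items():
    main_index.setdefault(word, set()).update(doc_ids)

print(main_index["the"])    # {1, 2, 3}
```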
After parsing, the indexer adds the referenced document to the document list for the appropriate words. In a larger search engine, the process of finding each word in the inverted index (in order to report that it occurred within a document) may be too time consuming, and so this process is commonly split up into two parts, the development of a forward index and a process which sorts the contents of the forward index into the inverted index. The inverted index is so named because it is an inversion of the forward index.
The forward index stores a list of words for each document. The following is a simplified form of the forward index:
Forward Index
Document 1: the, cow, says, moo
Document 2: the, cat, and, the, hat
Document 3: the, dish, ran, away, with, the, spoon
The rationale behind developing a forward index is that as documents are parsed, it is better to intermediately store the words per document. The delineation enables asynchronous system processing, which partially circumvents the inverted index update bottleneck.[18] The forward index is sorted to transform it to an inverted index. The forward index is essentially a list of pairs consisting of a document and a word, collated by the document. Converting the forward index to an inverted index is only a matter of sorting the pairs by the words. In this regard, the inverted index is a word-sorted forward index.
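The sort-based conversion can be shown directly in Python; the pair representation and the use of groupby are one possible realization of the description above, not the only one:

```python
# Forward-to-inverted conversion: flatten the forward index into
# (word, document) pairs, sort by word, then group.
from itertools import groupby

forward = {
    "Document 1": ["the", "cow", "says", "moo"],
    "Document 2": ["the", "cat", "and", "the", "hat"],
    "Document 3": ["the", "dish", "ran", "away", "with", "the", "spoon"],
}

pairs = [(word, doc) for doc, words in forward.items() for word in words]
pairs.sort()    # sort by word, then by document

inverted = {word: sorted({doc for _, doc in group})
            for word, group in groupby(pairs, key=lambda p: p[0])}

print(inverted["the"])   # ['Document 1', 'Document 2', 'Document 3']
print(inverted["moo"])   # ['Document 1']
```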
Generating or maintaining a large-scale search engine index represents a significant storage and processing challenge. Many search engines utilize a form of compression to reduce the size of the indices on disk.[19] Consider the following scenario for a full text, Internet search engine.
It takes 8 bits (or 1 byte) to store a single character. Some encodings use 2 bytes per character.[20][21]
The average number of characters in any given word on a page may be estimated at 5.
Given this scenario, an uncompressed index (assuming a non-conflated, simple index) for 2 billion web pages would need to store 500 billion word entries. At 1 byte per character, or 5 bytes per word, this would require 2500 gigabytes of storage space alone.[citation needed] This space requirement may be even larger for a fault-tolerant distributed storage architecture. Depending on the compression technique chosen, the index can be reduced to a fraction of this size. The tradeoff is the time and processing power required to perform compression and decompression.[citation needed]
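The arithmetic behind the 2,500-gigabyte figure, spelled out (the 250 words-per-page value is implied by the stated totals):

```python
# Checking the storage estimate quoted above.
pages = 2_000_000_000          # web pages
words_per_page = 250           # implied by 500 billion word entries
bytes_per_word = 5             # 5 characters at 1 byte each

entries = pages * words_per_page        # 500,000,000,000 word entries
size_bytes = entries * bytes_per_word   # 2,500,000,000,000 bytes
print(size_bytes / 1e9, "GB")           # 2500.0 GB
```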
Notably, large scale search engine designs incorporate the cost of storage as well as the costs of electricity to power the storage. Thus compression is a measure of cost.[citation needed]
Natural language processing is the subject of continuous research and technological improvement. Tokenization presents many challenges in extracting the necessary information from documents for indexing to support quality searching. Tokenization for indexing involves multiple technologies, the implementation of which are commonly kept as corporate secrets.[citation needed]
Native English speakers may at first consider tokenization to be a straightforward task, but this is not the case with designing a multilingual indexer. In digital form, the texts of other languages such as Chinese or Japanese represent a greater challenge, as words are not clearly delineated by whitespace. The goal during tokenization is to identify words for which users will search. Language-specific logic is employed to properly identify the boundaries of words, which is often the rationale for designing a parser for each language supported (or for groups of languages with similar boundary markers and syntax).
Language ambiguity
To assist with properly ranking matching documents, many search engines collect additional information about each word, such as its language or lexical category (part of speech). These techniques are language-dependent, as the syntax varies among languages. Documents do not always clearly identify the language of the document or represent it accurately. In tokenizing the document, some search engines attempt to automatically identify the language of the document.
Diverse file formats
In order to correctly identify which bytes of a document represent characters, the file format must be correctly handled. Search engines that support multiple file formats must be able to correctly open and access the document and be able to tokenize the characters of the document.
Faulty storage
The quality of the natural language data may not always be perfect. An unspecified number of documents, particularly on the Internet, do not closely obey proper file protocol. Binary characters may be mistakenly encoded into various parts of a document. Without recognition of these characters and appropriate handling, the index quality or indexer performance could degrade.
Unlike literate humans, computers do not understand the structure of a natural language document and cannot automatically recognize words and sentences. To a computer, a document is only a sequence of bytes. Computers do not 'know' that a space character separates words in a document. Instead, humans must program the computer to identify what constitutes an individual or distinct word referred to as a token. Such a program is commonly called a tokenizer or parser or lexer. Many search engines, as well as other natural language processing software, incorporate specialized programs for parsing, such as YACC or Lex.
During tokenization, the parser identifies sequences of characters that represent words and other elements, such as punctuation, which are represented by numeric codes, some of which are non-printing control characters. The parser can also identify entities such as email addresses, phone numbers, and URLs. When identifying each token, several characteristics may be stored, such as the token's case (upper, lower, mixed, proper), language or encoding, lexical category (part of speech, like 'noun' or 'verb'), position, sentence number, sentence position, length, and line number.
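A toy tokenizer illustrating a few of these per-token attributes; the regular expression and the attribute set are simplifications assumed here, and real lexers (such as those generated by Lex or YACC) are far more elaborate:

```python
# Minimal tokenizer sketch: split text into tokens and record a few of
# the per-token attributes mentioned above (position and case).
import re

def tokenize(text):
    tokens = []
    for match in re.finditer(r"[A-Za-z0-9@.+-]+|[^\sA-Za-z0-9]", text):
        tok = match.group()
        case = ("upper" if tok.isupper() else
                "lower" if tok.islower() else "mixed")
        tokens.append({"token": tok, "position": match.start(), "case": case})
    return tokens

for t in tokenize("The cow says Moo!"):
    print(t)
# {'token': 'The', 'position': 0, 'case': 'mixed'} ... and so on,
# with the trailing '!' emitted as its own punctuation token.
```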
If the search engine supports multiple languages, a common initial step during tokenization is to identify each document's language; many of the subsequent steps are language dependent (such as stemming and part of speech tagging). Language recognition is the process by which a computer program attempts to automatically identify, or categorize, the language of a document. Other names for language recognition include language classification, language analysis, language identification, and language tagging. Automated language recognition is the subject of ongoing research in natural language processing. Finding which language the words belong to may involve the use of a language recognition chart.
If the search engine supports multiple document formats, documents must be prepared for tokenization. The challenge is that many document formats contain formatting information in addition to textual content. For example, HTML documents contain HTML tags, which specify formatting information such as new line starts, bold emphasis, and font size or style. If the search engine were to ignore the difference between content and 'markup', extraneous information would be included in the index, leading to poor search results. Format analysis is the identification and handling of the formatting content embedded within documents which controls the way the document is rendered on a computer screen or interpreted by a software program. Format analysis is also referred to as structure analysis, format parsing, tag stripping, format stripping, text normalization, text cleaning and text preparation. The challenge of format analysis is further complicated by the intricacies of various file formats. Certain file formats are proprietary with very little information disclosed, while others are well documented. Common, well-documented file formats that many search engines support include:
Options for dealing with various formats include using a publicly available commercial parsing tool offered by the organization which developed, maintains, or owns the format, or writing a custom parser.
Some search engines support inspection of files that are stored in a compressed or encrypted file format. When working with a compressed format, the indexer first decompresses the document; this step may result in one or more files, each of which must be indexed separately. Commonly supported compressed file formats include:
TAR.Z, TAR.GZ or TAR.BZ2 - Unix archive files compressed with Compress, GZIP or BZIP2
Format analysis can involve quality improvement methods to avoid including 'bad information' in the index. Content creators can manipulate the formatting information to include additional content. Examples of abusing document formatting for spamdexing:
Including hundreds or thousands of words in a section that is hidden from view on the computer screen, but visible to the indexer, by use of formatting (e.g. hidden "div" tag in HTML, which may incorporate the use of CSS or JavaScript to do so).
Setting the foreground font color of words to the same as the background color, making words hidden on the computer screen to a person viewing the document, but not hidden to the indexer.
Some search engines incorporate section recognition, the identification of major parts of a document, prior to tokenization. Not all the documents in a corpus read like a well-written book, divided into organized chapters and pages. Many documents on the web, such as newsletters and corporate reports, contain erroneous content and side-sections that do not contain primary material (that which the document is about). For example, articles on the Wikipedia website display a side menu with links to other web pages. Some file formats, like HTML or PDF, allow for content to be displayed in columns. Even though the content is displayed, or rendered, in different areas of the view, the raw markup content may store this information sequentially. Words that appear sequentially in the raw source content are indexed sequentially, even though these sentences and paragraphs are rendered in different parts of the computer screen. If search engines index this content as if it were normal content, the quality of the index and search quality may be degraded due to the mixed content and improper word proximity. Two primary problems are noted:
Content in different sections is treated as related in the index when in reality it is not
Organizational side bar content is included in the index, but the side bar content does not contribute to the meaning of the document, and the index is filled with a poor representation of its documents.
Section analysis may require the search engine to implement the rendering logic of each document, essentially an abstract representation of the actual document, and then index the representation instead. For example, some content on the Internet is rendered via JavaScript. If the search engine does not render the page and evaluate the JavaScript within the page, it would not 'see' this content in the same way and would index the document incorrectly. Given that some search engines do not bother with rendering issues, many web page designers avoid displaying content via JavaScript or use the noscript tag to ensure that the web page is indexed properly. At the same time, this fact can also be exploited to cause the search engine indexer to 'see' different content than the viewer.
Indexing often has to recognize HTML tags in order to assign priority. Labels such as strong and link can be given higher or lower priority to optimize the ranking order, since such labels at the beginning of the text may not prove to be relevant. Indexers such as Google and Bing also take care that large blocks of text are not treated as a relevant source solely because of strong markup emphasis.[22]
Meta tag indexing plays an important role in organizing and categorizing web content. Specific documents often contain embedded meta information such as author, keywords, description, and language. For HTML pages, the meta tag contains keywords which are also included in the index. Earlier Internet search engine technology would only index the keywords in the meta tags for the forward index; the full document would not be parsed. At that time full-text indexing was not as well established, nor was computer hardware able to support such technology. The design of the HTML markup language initially included support for meta tags for the very purpose of being properly and easily indexed, without requiring tokenization.[23]
As the Internet grew through the 1990s, many brick-and-mortar corporations went 'online' and established corporate websites. The keywords used to describe webpages (many of which were corporate-oriented webpages similar to product brochures) changed from descriptive to marketing-oriented keywords designed to drive sales by placing the webpage high in the search results for specific search queries. The fact that these keywords were subjectively specified was leading to spamdexing, which drove many search engines to adopt full-text indexing technologies in the 1990s. Search engine designers and companies could only place so many 'marketing keywords' into the content of a webpage before draining it of all interesting and useful information. Given that conflict of interest with the business goal of designing user-oriented websites which were 'sticky', the customer lifetime value equation was changed to incorporate more useful content into the website in hopes of retaining the visitor. In this sense, full-text indexing was more objective and increased the quality of search engine results, as it was one more step away from subjective control of search engine result placement, which in turn furthered research of full-text indexing technologies.[citation needed]
In desktop search, many solutions incorporate meta tags to provide a way for authors to further customize how the search engine will index content from various files that is not evident from the file content. Desktop search is more under the control of the user, while Internet search engines must focus more on the full text index.[citation needed]
^ Clarke, C., Cormack, G.: Dynamic Inverted Indexes for a Distributed Full-Text Retrieval System. TechRep MT-95-01, University of Waterloo, February 1995.
^ Charles E. Jacobs, Adam Finkelstein, David H. Salesin. Fast Multiresolution Image Querying. Department of Computer Science and Engineering, University of Washington. 1995. Verified Dec 2006.
^ Brown, E.W.: Execution Performance Issues in Full-Text Information Retrieval. Computer Science Department, University of Massachusetts Amherst, Technical Report 95-81, October 1995.
^ Cutting, D., Pedersen, J.: Optimizations for Dynamic Inverted Index Maintenance. Proceedings of SIGIR, 405-411, 1990.
^ Gusfield, Dan (1999) [1997]. Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. US: Cambridge University Press. ISBN 0-521-58519-8.
^ C. C. Foster, Information retrieval: information storage and retrieval using AVL trees, Proceedings of the 1965 20th National Conference, p. 192-205, August 24–26, 1965, Cleveland, Ohio, United States.
^ Landauer, W. I.: The balanced tree and its utilization in information retrieval. IEEE Trans. on Electronic Computers, Vol. EC-12, No. 6, December 1963.
^ Büttcher, Stefan; Clarke, Charles L. A.; Cormack, Gordon V. (2016). Information Retrieval: Implementing and Evaluating Search Engines (First MIT Press paperback ed.). Cambridge, Massachusetts: The MIT Press. ISBN 978-0-262-52887-0.
^ Tomasic, A., et al.: Incremental Updates of Inverted Lists for Text Document Retrieval. Short version of Stanford University Computer Science Technical Note STAN-CS-TN-93-1, December 1993.
Google Search (also known simply as Google or google.com) is a search engine operated by Google. It allows users to search for information on the Web by entering keywords or phrases. Google Search uses algorithms to analyze and rank websites based on their relevance to the search query. Google Search is the most-visited website in the world. As of 2025, Google Search has a 90% share of the global search engine market.[3] Approximately 24.1% of Google's monthly global traffic comes from the United States, 5.6% from India, 5.5% from Japan, 4.8% from Brazil, and 3.7% from the United Kingdom according to data provided by Similarweb. The same source reports that 58% of users are male and 42% are female.[4]
The order of search results returned by Google is based, in part, on a priority rank system called "PageRank". Google Search also provides many different options for customized searches, using symbols to include, exclude, specify or require certain search behavior, and offers specialized interactive experiences, such as flight status and package tracking, weather forecasts, currency, unit, and time conversions, word definitions, and more.
Analysis of the frequency of search terms may indicate economic, social and health trends.[10] Data about the frequency of use of search terms on Google can be openly queried via Google Trends and have been shown to correlate with flu outbreaks and unemployment levels, providing the information faster than traditional reporting methods and surveys. As of mid-2016, Google's search engine had begun to rely on deep neural networks.[11]
In August 2024, a US judge in Virginia ruled that Google held an illegal monopoly over Internet search and search advertising.[12][13] The court found that Google maintained its market dominance by paying large amounts to phone-makers and browser-developers to make Google its default search engine.[13] In April 2025, a trial began to determine which remedies sought by the Department of Justice would be imposed to address Google's illegal monopoly; these could include breaking up the company and preventing it from using its data to secure dominance in the AI sector.[14]
Despite Google search's immense index, sources generally assume that Google is only indexing less than 5% of the total Internet, with the rest belonging to the deep web, inaccessible through its search tools.[15][20][21]
In 2012, Google changed its search indexing tools to demote sites that had been accused of piracy.[22] In October 2016, Gary Illyes, a webmaster trends analyst with Google, announced that the search engine would be making a separate, primary web index dedicated for mobile devices, with a secondary, less up-to-date index for desktop use. The change was a response to the continued growth in mobile usage, and a push for web developers to adopt a mobile-friendly version of their websites.[23][24] In December 2017, Google began rolling out the change, having already done so for multiple websites.[25]
In August 2009, Google invited web developers to test a new search architecture, codenamed "Caffeine", and give their feedback. The new architecture provided no visual differences in the user interface, but added significant speed improvements and a new "under-the-hood" indexing infrastructure. The move was interpreted in some quarters as a response to Microsoft's recent release of an upgraded version of its own search service, renamed Bing, as well as the launch of Wolfram Alpha, a new search engine based on "computational knowledge".[26][27] Google announced completion of "Caffeine" on June 8, 2010, claiming 50% fresher results due to continuous updating of its index.[28]
With "Caffeine", Google moved its back-end indexing system away from MapReduce and onto Bigtable, the company's distributed database platform.[29][30]
In August 2018, Danny Sullivan from Google announced a broad core algorithm update. According to analyses by the industry publications Search Engine Watch and Search Engine Land, the update demoted medical and health-related websites that were not user-friendly and did not provide a good user experience. This is why industry experts named it "Medic".[31]
Google reserves very high standards for YMYL (Your Money or Your Life) pages. This is because misinformation can affect users financially, physically, or emotionally. Therefore, the update targeted particularly those YMYL pages that have low-quality content and misinformation. This resulted in the algorithm targeting health and medical-related websites more than others. However, many other websites from other industries were also negatively affected.[32]
By 2012, it handled more than 3.5 billion searches per day.[33] In 2013 the European Commission found that Google Search favored Google's own products, instead of the best result for consumers' needs.[34] In February 2015 Google announced a major change to its mobile search algorithm that would favor mobile-friendly websites over others. Nearly 60% of Google searches come from mobile phones. Google says it wants users to have access to premium quality websites. Websites which lack a mobile-friendly interface would be ranked lower, and the update was expected to cause a shake-up of rankings. Businesses that fail to update their websites accordingly could see a dip in their regular website traffic.[35]
Google's rise was largely due to a patented algorithm called PageRank which helps rank web pages that match a given search string.[36] When Google was a Stanford research project, it was nicknamed BackRub because the technology checks backlinks to determine a site's importance. Other keyword-based methods to rank search results, used by many search engines that were once more popular than Google, would check how often the search terms occurred in a page, or how strongly associated the search terms were within each resulting page. The PageRank algorithm instead analyzes human-generated links assuming that web pages linked from many important pages are also important. The algorithm computes a recursive score for pages, based on the weighted sum of other pages linking to them. PageRank is thought to correlate well with human concepts of importance. In addition to PageRank, Google, over the years, has added many other secret criteria for determining the ranking of resulting pages. This is reported to comprise over 250 different indicators,[37][38] the specifics of which are kept secret to avoid difficulties created by scammers and help Google maintain an edge over its competitors globally.
PageRank was influenced by a similar page-ranking and site-scoring algorithm earlier used for RankDex, developed by Robin Li in 1996. Larry Page's patent for PageRank filed in 1998 includes a citation to Li's earlier patent. Li later went on to create the Chinese search engine Baidu in 2000.[39][40]
In a potential hint of Google's future direction of their Search algorithm, Google's then chief executive Eric Schmidt, said in a 2007 interview with the Financial Times: "The goal is to enable Google users to be able to ask the question such as 'What shall I do tomorrow?' and 'What job shall I take?'".[41] Schmidt reaffirmed this during a 2010 interview with The Wall Street Journal: "I actually think most people don't want Google to answer their questions, they want Google to tell them what they should be doing next."[42]
Because Google is the most popular search engine, many webmasters attempt to influence their website's Google rankings. An industry of consultants has arisen to help websites increase their rankings on Google and other search engines. This field, called search engine optimization, attempts to discern patterns in search engine listings, and then develop a methodology for improving rankings to draw more searchers to their clients' sites. Search engine optimization encompasses both "on page" factors (like body copy, title elements, H1 heading elements and image alt attribute values) and Off Page Optimization factors (like anchor text and PageRank). The general idea is to affect Google's relevance algorithm by incorporating the keywords being targeted in various places "on page", in particular the title element and the body copy (note: the higher up in the page, presumably the better its keyword prominence and thus the ranking). Too many occurrences of the keyword, however, cause the page to look suspect to Google's spam checking algorithms. Google has published guidelines for website owners who would like to raise their rankings when using legitimate optimization consultants.[43] It has been hypothesized, and, allegedly, is the opinion of the owner of one business about which there have been numerous complaints, that negative publicity, for example, numerous consumer complaints, may serve as well to elevate page rank on Google Search as favorable comments.[44] The particular problem addressed in The New York Times article, which involved DecorMyEyes, was addressed shortly thereafter by an undisclosed fix in the Google algorithm. According to Google, it was not the frequently published consumer complaints about DecorMyEyes which resulted in the high ranking but mentions on news websites of events which affected the firm such as legal actions against it. Google Search Console helps to check for websites that use duplicate or copyright content.[45]
In 2013, Google significantly upgraded its search algorithm with "Hummingbird". Its name was derived from the speed and accuracy of the hummingbird.[46] The change was announced on September 26, 2013, having already been in use for a month.[47] "Hummingbird" places greater emphasis on natural language queries, considering context and meaning over individual keywords.[46] It also looks deeper at content on individual pages of a website, with improved ability to lead users directly to the most appropriate page rather than just a website's homepage.[48] The upgrade marked the most significant change to Google search in years, with more "human" search interactions[49] and a much heavier focus on conversation and meaning.[46] Thus, web developers and writers were encouraged to optimize their sites with natural writing rather than forced keywords, and make effective use of technical web development for on-site navigation.[50]
In 2023, drawing on internal Google documents disclosed as part of the United States v. Google LLC (2020) antitrust case, technology reporters claimed that Google Search was "bloated and overmonetized"[51] and that the "semantic matching" of search queries put advertising profits before quality.[52] Wired withdrew Megan Gray's piece after Google complained about alleged inaccuracies, while the author reiterated that "As stated in court, 'A goal of Project Mercury was to increase commercial queries'".[53]
In March 2024, Google announced a significant update to its core search algorithm and spam targeting, which was expected to wipe out 40 percent of all spam results.[54] On March 20, it was confirmed that the rollout of the spam update was complete.[55]
On September 10, 2024, the EU Court of Justice found that Google held an illegal monopoly in the way the company showed favoritism to its shopping search, and could not avoid paying a €2.4 billion fine.[56] The EU Court of Justice referred to Google's treatment of rival shopping searches as "discriminatory" and in violation of the Digital Markets Act.[56]
At the top of the search page, Google displays the approximate result count and the response time, given to two decimal places. Each search result consists of a page title and URL, a date, and a preview text snippet. Along with web search results, sections with images, news, and videos may appear.[57] The length of the previewed text snippet was experimented with in 2015 and 2017.[58][59]
"Universal search" was launched by Google on May 16, 2007, as an idea that merged the results from different kinds of search types into one. Prior to Universal search, a standard Google search would consist of links only to websites. Universal search, however, incorporates a wide variety of sources, including websites, news, pictures, maps, blogs, videos, and more, all shown on the same search results page.[60][61]Marissa Mayer, then-vice president of search products and user experience, described the goal of Universal search as "we're attempting to break down the walls that traditionally separated our various search properties and integrate the vast amounts of information available into one simple set of search results.[62]
In June 2017, Google expanded its search results to cover available job listings. The data is aggregated from various major job boards and collected by analyzing company homepages. Initially only available in English, the feature aims to simplify finding jobs suitable for each user.[63][64]
In May 2009, Google announced that they would be parsing website microformats to populate search result pages with "Rich snippets". Such snippets include additional details about results, such as displaying reviews for restaurants and social media accounts for individuals.[65]
In May 2016, Google expanded on the "Rich snippets" format to offer "Rich cards", which, similarly to snippets, display more information about results, but shows them at the top of the mobile website in a swipeable carousel-like format.[66] Originally limited to movie and recipe websites in the United States only, the feature expanded to all countries globally in 2017.[67]
The Knowledge Graph is a knowledge base used by Google to enhance its search engine's results with information gathered from a variety of sources.[68] This information is presented to users in a box to the right of search results.[69] Knowledge Graph boxes were added to Google's search engine in May 2012,[68] starting in the United States, with international expansion by the end of the year.[70] The information covered by the Knowledge Graph grew significantly after launch, tripling its original size within seven months,[71] and being able to answer "roughly one-third" of the 100 billion monthly searches Google processed in May 2016.[72] The information is often used as a spoken answer in Google Assistant[73] and Google Home searches.[74] The Knowledge Graph has been criticized for providing answers without source attribution.[72]
A Google Knowledge Panel[75] is a feature integrated into Google search engine result pages, designed to present a structured overview of entities such as individuals, organizations, locations, or objects directly within the search interface. This feature leverages data from Google's Knowledge Graph,[76] a database that organizes and interconnects information about entities, enhancing the retrieval and presentation of relevant content to users.
The content within a Knowledge Panel[77] is derived from various sources, including Wikipedia and other structured databases, ensuring that the information displayed is both accurate and contextually relevant. For instance, querying a well-known public figure may trigger a Knowledge Panel displaying essential details such as biographical information, birthdate, and links to social media profiles or official websites.
The primary objective of the Google Knowledge Panel is to provide users with immediate, factual answers, reducing the need for extensive navigation across multiple web pages.
In May 2017, Google enabled a new "Personal" tab in Google Search, letting users search for content in their Google accounts' various services, including email messages from Gmail and photos from Google Photos.[78][79]
Google Discover, previously known as Google Feed, is a personalized stream of articles, videos, and other news-related content. The feed contains a "mix of cards" which show topics of interest based on users' interactions with Google, or topics they choose to follow directly.[80] Cards include, "links to news stories, YouTube videos, sports scores, recipes, and other content based on what [Google] determined you're most likely to be interested in at that particular moment."[80] Users can also tell Google they're not interested in certain topics to avoid seeing future updates.
Google Discover launched in December 2016[81] and received a major update in July 2017.[82] Another major update was released in September 2018, which renamed the app from Google Feed to Google Discover, updated the design, and added more features.[83]
Discover can be found on a tab in the Google app and by swiping left on the home screen of certain Android devices. As of 2019, Google does not allow political campaigns worldwide to target their advertising at people in order to sway their vote.[84]
AI Overviews result for "What is Wikipedia", generated on 2 March 2026
At the 2023 Google I/O event in May, Google unveiled Search Generative Experience (SGE), an experimental feature in Google Search available through Google Labs which produces AI-generated summaries in response to search prompts.[85] This was part of Google's wider efforts to counter the unprecedented rise of generative AI technology, ushered by OpenAI's launch of ChatGPT, which sent Google executives into a panic due to its potential threat to Google Search.[86] Google added the ability to generate images in October.[87] At I/O in 2024, the feature was upgraded and renamed AI Overviews.[88]
AI Overviews was rolled out to users in the United States in May 2024.[88] The feature faced public criticism in the first weeks of its rollout after errors from the tool went viral online. These included results suggesting users add glue to pizza or eat rocks,[89] or incorrectly claiming Barack Obama is Muslim.[90] Google described these viral errors as "isolated examples", maintaining that most AI Overviews provide accurate information.[89][91] Two weeks after the rollout of AI Overviews, Google made technical changes and scaled back the feature, pausing its use for some health-related queries and limiting its reliance on social media posts.[92] Scientific American has criticised the system on environmental grounds, as such a search uses 30 times more energy than a conventional one.[93] It has also been criticized for condensing information from various sources, making it less likely for people to view full articles and websites. When it was announced in May 2024, Danielle Coffey, CEO of the News/Media Alliance, was quoted as saying "This will be catastrophic to our traffic, as marketed by Google to further satisfy user queries, leaving even less incentive to click through so that we can monetize our content."[94]
In August 2024, AI Overviews were rolled out in the UK, India, Japan, Indonesia, Mexico and Brazil, with local language support.[95] On October 28, 2024, AI Overviews was rolled out to 100 more countries, including Australia and New Zealand.[96]
In March 2025, Google introduced an experimental "AI Mode" within its Search platform, enabling users to input complex, multi-part queries and receive comprehensive, AI-generated responses. This feature uses Google's advanced Gemini 2.0 model, which enhances the system's reasoning capabilities and supports multimodal inputs, including text, images, and voice.
Initially, AI Mode was available to Google One AI Premium subscribers in the United States, who could access it through the Search Labs platform. This phased rollout allowed Google to gather user feedback and refine the feature before a broader release.
In late June 2011, Google introduced a new look to the Google homepage in order to boost the use of the Google+ social tools.[97]
One of the major changes was replacing the classic navigation bar with a black one. Google's digital creative director Chris Wiggins explains: "We're working on a project to bring you a new and improved Google experience, and over the next few months, you'll continue to see more updates to our look and feel."[98] The new navigation bar has been negatively received by a vocal minority.[99]
In November 2013, Google started testing yellow labels for advertisements displayed in search results, to improve user experience. The new labels, highlighted in yellow and aligned to the left of each sponsored link, help users differentiate between organic and sponsored results.[100]
On December 15, 2016, Google rolled out a new desktop search interface that mimics their modular mobile user interface. The mobile design consists of a tabular layout that highlights search features in boxes, imitating the desktop Knowledge Graph real estate which appears in the right-hand rail of the search engine results page. These featured elements frequently include Twitter carousels, People Also Search For, and Top Stories (vertical and horizontal design) modules. The Local Pack and Answer Box were two of the original features of the Google SERP that were primarily showcased in this manner, but this new layout creates a previously unseen level of design consistency for Google results.[101]
Google offers a "Google Search" mobile app for Android and iOS devices.[102] The mobile apps exclusively feature Google Discover and a "Collections" feature, in which the user can save for later perusal any type of search result like images, bookmarks or map locations into groups.[103] Android devices were introduced to a preview of the feed, perceived as related to Google Now, in December 2016,[104] while it was made official on both Android and iOS in July 2017.[105][106]
In April 2016, Google updated its Search app on Android to feature "Trends"; search queries gaining popularity appeared in the autocomplete box along with normal query autocompletion.[107] The update received significant backlash, due to encouraging search queries unrelated to users' interests or intentions, prompting the company to issue an update with an opt-out option.[108] In September 2017, the Google Search app on iOS was updated to feature the same functionality.[109]
In December 2017, Google released "Google Go", an app designed to enable use of Google Search on physically smaller and lower-spec devices in multiple languages. A Google blog post about designing "India-first" products and features explains that it is "tailor-made for the millions of people in [India and Indonesia] coming online for the first time".[110]
A definition link is provided for many search terms.
Google Search consists of a series of localized websites. The largest of those, the google.com site, is the top most-visited website in the world.[111] Some of its features include a definition link for most searches including dictionary words, the number of results you got on your search, links to other searches (e.g. for words that Google believes to be misspelled, it provides a link to the search results using its proposed spelling), the ability to filter results to a date range,[112] and many more.
Google search accepts queries as normal text, as well as individual keywords.[113] It automatically corrects apparent misspellings by default (while offering to use the original spelling as a selectable alternative), and provides the same results regardless of capitalization.[113] For more customized results, one can use a wide variety of operators, including, but not limited to:[114][115]
OR or | – Search for webpages containing one of two similar queries, such as marathon OR race
AND – Search for webpages containing two similar queries, such as marathon AND runner
- (minus sign) – Exclude a word or a phrase, so that "apple -tree" searches where word "tree" is not used
-ai – Exclude AI-generated content from webpages, including Google AI Overviews.
"" – Force inclusion of a word or a phrase, such as "tallest building"
* – Placeholder symbol allowing for any substitute words in the context of the query, such as "largest * in the world"
.. – Search within a range of numbers, such as "camera $50..$100"
site: – Search within a specific website, such as "site:youtube.com"
-site: – Do not display search results from within a specific website, such as "-site:youtube.com"
define: – Search for definitions for a word or phrase, such as "define:phrase"
stocks: – See the stock price of investments, such as "stocks:googl"
related: – Find web pages related to specific URL addresses, such as "related:www.wikipedia.org"
( ) – Group operators and searches, such as (marathon OR race) AND shoes
filetype: or ext: – Search for specific file types, such as filetype:gif
before: – Search for before a specific date, such as spacex before:2020-08-11
after: – Search for after a specific date, such as iphone after:2007-06-29
@ – Search for the search term on a specific social media site, such as "@twitter"
Google also offers a Google Advanced Search page with a web interface to access the advanced features without needing to remember the special operators.[116]
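For illustration only, a small hypothetical helper that assembles a query string from a few of the operators listed above; build_query is not a Google API, just a sketch of how the operators compose:

```python
# Hypothetical query-string builder using a subset of the documented
# operators (site:, filetype:, before:, after:, exclusion, exact phrase).
def build_query(terms, site=None, filetype=None, exclude=(), exact=None,
                before=None, after=None):
    parts = list(terms)
    if exact:
        parts.append(f'"{exact}"')
    parts += [f"-{w}" for w in exclude]
    if site:
        parts.append(f"site:{site}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    if before:
        parts.append(f"before:{before}")
    if after:
        parts.append(f"after:{after}")
    return " ".join(parts)

print(build_query(["marathon"], site="youtube.com", exclude=["race"],
                  after="2020-01-01"))
# marathon -race site:youtube.com after:2020-01-01
```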
As of 2024 Google no longer displays cached versions of webpages. Previously the command cache: would present a cached version of a webpage with the search term highlighted, e.g. "cache:www.google.com xxx" showed cached content with word "xxx" highlighted.[117]
Unlike other search engines, when searching for exact phrases, Google Search only takes words that are on the same line into account.
Google applies query expansion to submitted search queries, using techniques to deliver results that it considers "smarter" than the query users actually submitted. This technique involves several steps, including:[118]
Word stemming – Certain words can be reduced to their stem so that other, similar terms are also found in results; a search for "translator" can thus also match "translation"
Acronyms – Searching for abbreviations can also return results about the name in its full length, so that "NATO" can show results for "North Atlantic Treaty Organization"
Misspellings – Google will often suggest correct spellings for misspelled words
Synonyms – In most cases where a word is incorrectly used in a phrase or sentence, Google search will show results based on the correct synonym
Translations – The search engine can, in some instances, suggest results for specific words in a different language
Ignoring words – When search queries contain extraneous or insignificant words, Google search may drop those specific words from the query
Location sensitivity – Google may consider users' geographical location to deliver more relevant results.
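As a toy illustration of the stemming step from the list above, a crude suffix stripper shows how "translator" and "translation" can be reduced to a shared stem; the suffix list here is an assumption for demonstration, and real systems use proper stemmers such as Porter's:

```python
# Crude suffix-stripping stemmer, for illustration only.
def crude_stem(word):
    for suffix in ("ations", "ation", "ors", "or", "ing", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

print(crude_stem("translator"), crude_stem("translation"))
# translat translat  -> the two queries share a stem, so results overlap
```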
A screenshot of suggestions by Google Search when "wikip" is typed
In 2008, Google started to give users autocompleted search suggestions in a list below the search bar while typing, originally with the approximate result count previewed for each listed search suggestion.[119]
"I'm Feeling Lucky" redirects here. For the 2011 book by Douglas Edwards, see I'm Feeling Lucky (book).
Google's homepage used to show a button labeled "I'm Feeling Lucky". This feature originally allowed users to type in their search query, click the button and be taken directly to the first result, bypassing the search results page. Clicking it while leaving the search box empty opens Google's archive of Doodles.[120] With the 2010 announcement of Google Instant, an automatic feature that immediately displays relevant results as users are typing in their query, the "I'm Feeling Lucky" button disappeared, requiring users to opt out of Instant results through search settings to keep using the "I'm Feeling Lucky" functionality.[121] In 2012, "I'm Feeling Lucky" was changed to serve as an advertisement for Google services; when users hover their mouse over the button, it spins and shows an emotion ("I'm Feeling Puzzled" or "I'm Feeling Trendy", for instance), and, when clicked, takes users to a Google service related to that emotion.[122]
Tom Chavez of "Rapt", a firm helping to determine a website's advertising worth, estimated in 2007 that Google lost $110 million in revenue per year due to use of the button, which bypasses the advertisements found on the search results page.[123]
Besides its main text-based search function, Google Search also offers multiple quick, interactive features. These include, but are not limited to:[124][125][126]
During Google's developer conference, Google I/O, in May 2013, the company announced that users of Google Chrome and ChromeOS would be able to have the browser initiate an audio-based search by saying "OK Google", with no button presses required. After the answer is presented, users can follow up with additional, contextual questions; for example, a user might initially ask "OK Google, will it be sunny in Santa Cruz this weekend?", hear a spoken answer, and reply with "how far is it from here?"[127][128] An update to the Chrome browser with voice-search functionality rolled out a week later, though it required a button press on a microphone icon rather than "OK Google" voice activation.[129] Shortly thereafter, Google released a browser extension for Chrome carrying a "beta" tag to denote unfinished development.[130] In May 2014, the company officially added "OK Google" into the browser itself;[131] it removed the feature in October 2015, citing low usage, though the microphone icon for activation remained available.[132] In May 2016, 20% of search queries on mobile devices were made through voice.[133]
In addition to its tool for searching web pages, Google also provides services for searching images (Google Images), Usenet newsgroups, news websites, videos (Google Videos), searching by locality, maps, and items for sale online. Google Videos allows searching the World Wide Web for video clips.[134] The service evolved from Google Video, Google's discontinued video hosting service, which also allowed users to search the web for video clips.[134]
There are also products available from Google that are not directly search-related. Gmail, for example, is a webmail application, but it still includes search features; Google Browser Sync does not offer any search facilities, although it aims to organize a user's browsing.
In 2009, Google claimed that a search query requires altogether about 1 kJ or 0.0003 kW·h,[137] which is enough to raise the temperature of one liter of water by 0.24 °C. According to green search engine Ecosia, the industry standard for search engines is estimated to be about 0.2 grams of CO2 emission per search.[138] Google's 40,000 searches per second translate to 8 kg CO2 per second or over 252 million kilos of CO2 per year.[139]
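These figures can be cross-checked with basic arithmetic, assuming the specific heat of water (about 4,184 J per kg per °C) and roughly 31.5 million seconds in a year:

```python
# Sanity-checking the cited energy and emissions figures.
energy_j = 1_000                      # ~1 kJ per search (Google, 2009)
water_heat = 4184                     # J per kg per degree C
print(energy_j / water_heat)          # ~0.24 degrees C for one litre (1 kg) of water

co2_per_search_g = 0.2                # industry estimate per search (Ecosia)
searches_per_s = 40_000
per_second_kg = searches_per_s * co2_per_search_g / 1000
print(per_second_kg)                  # 8.0 kg CO2 per second
print(per_second_kg * 31_536_000)     # ~252 million kg CO2 per year
```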
On certain occasions, the logo on Google's webpage will change to a special version, known as a "Google Doodle". This is a picture, drawing, animation, or interactive game that includes the logo. It is usually done for a special event or day, although not all of them are well known.[140] Clicking on the Doodle links to a string of Google search results about the topic. The first was a reference to the Burning Man Festival in 1998,[141][142] and others have been produced for the birthdays of notable people like Albert Einstein, historical events like the 50th anniversary of the interlocking Lego brick, and holidays like Valentine's Day.[143] Some Google Doodles have interactivity beyond a simple search, such as the famous "Google Pac-Man" version that appeared on May 21, 2010.
In 2012, the US Federal Trade Commission fined Google US$22.5 million for breaching its agreement not to violate the privacy of users of Apple's Safari web browser.[144] The FTC was also continuing to investigate whether Google's favoring of its own services in its search results violated antitrust regulations.[145]
Since 2012, Google Inc. has globally introduced encrypted connections for most of its clients, in order to bypass government blocking of its commercial and IT services.[146]
Google has been criticized for placing long-term cookies on users' machines to store preferences, a tactic that also enables it to track a user's search terms and retain the data for more than a year.[147] Google searches have also triggered keyword warrants and geofence warrants in which information is shared with law enforcement, leading to a criminal case.[148] Investigators can request that Google disclose everyone who searched a keyword or query, or every phone in a particular place at a certain time.[149] In 2023, the Colorado Supreme Court upheld the use of a search-history request to identify suspects in a 2020 arson case, while stating that its ruling was not a "broad proclamation" and noting that the warrant lacked individualized probable cause.[150]
A 2019 New York Times article on Google Search showed that images of child sexual abuse had been found on Google and that the company had been reluctant at times to remove them.[154]
Google flags search results with the message "This site may harm your computer" if the site is known to install malicious software in the background or otherwise surreptitiously. For approximately 40 minutes on January 31, 2009, all search results were mistakenly classified as malware and could therefore not be clicked; instead a warning message was displayed and the user was required to enter the requested URL manually. The bug was caused by human error.[155][156][157][158] The URL of "/" (which expands to all URLs) was mistakenly added to the malware patterns file.[156][157]
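A minimal sketch, assuming a prefix-based pattern file (the actual file format is not public), shows why a lone "/" entry flags every URL:

```python
# Illustration of the 2009 incident: URL paths are checked against a
# list of known-bad prefixes, so the entry "/" matches every path.
malware_patterns = ["/badsite/", "/"]   # "/" was added by mistake

def is_flagged(path: str) -> bool:
    return any(path.startswith(p) for p in malware_patterns)

print(is_flagged("/wiki/Google_Search"))  # True: "/" matches everything
```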
In 2007, a group of researchers observed a tendency for users to rely exclusively on Google Search for finding information, writing that "With the Google interface the user gets the impression that the search results imply a kind of totality. In fact, one only sees a small part of what one could see if one also integrates other research tools."[159]
In 2011, Internet activist Eli Pariser showed that Google Search results are tailored to individual users, effectively isolating them in what he termed a filter bubble. Pariser holds algorithms used in search engines such as Google Search responsible for catering "a personal ecosystem of information".[160] Although contrasting views have played down the threat of an "informational dystopia" and questioned the scientific basis of Pariser's claims,[161] filter bubbles have been cited, alongside fake news and echo chambers, as a factor in the surprising result of the 2016 U.S. presidential election, with the suggestion that Facebook and Google have designed personalized online realities in which "we only see and hear what we like".[162]
In a November 2023 disclosure, during the ongoing antitrust trial against Google, an economics professor at the University of Chicago revealed that Google pays Apple 36% of all search advertising revenue generated when users access Google through the Safari browser. This revelation reportedly caused Google's lead attorney to cringe visibly.[165] The revenue generated from Safari users has been kept confidential, but the 36% figure suggests that it is likely in the tens of billions of dollars.
Both Apple and Google have argued that disclosing the specific terms of their search default agreement would harm their competitive positions. However, the court ruled that the information was relevant to the antitrust case and ordered its disclosure. This revelation has raised concerns about the dominance of Google in the search engine market and the potential anticompetitive effects of its agreements with Apple.[166]
Google's search systems use algorithms designed to understand and predict human behavior. The book Race After Technology: Abolitionist Tools for the New Jim Code[167] by Ruha Benjamin discusses human bias as something the Google search engine can reflect. In 2016, some users searched Google for "three Black teenagers" and were shown criminal mugshots of young African American teenagers. When the same users searched for "three White teenagers", they were presented with photos of smiling, happy teenagers, and a search for "three Asian teenagers" returned revealing photos of Asian girls and women. Benjamin concluded that these results reflect human prejudice and views on different ethnic groups. A group of analysts explained the concept of a racist computer program: "The idea here is that computers, unlike people, can't be racist but we're increasingly learning that they do in fact take after their makers ... Some experts believe that this problem might stem from the hidden biases in the massive piles of data that the algorithms process as they learn to recognize patterns ... reproducing our worst values".[167]
On August 5, 2024, Google lost a lawsuit, filed in 2020 in the U.S. District Court for the District of Columbia, in which Judge Amit Mehta found that the company held an illegal monopoly over Internet search.[168] This monopoly was held to be in violation of Section 2 of the Sherman Act.[169] Google has said it will appeal the ruling,[170] though it did propose to loosen its search deals with Apple and others that require them to set Google as the default search engine.[171]
As people talk about "googling" rather than searching, the company has taken some steps to defend its trademark, in an effort to prevent it from becoming a generic trademark.[172][173] This has led to lawsuits, threats of lawsuits, and the use of euphemisms, such as calling Google Search a famous web search engine.[174]
Until May 2013, Google Search had offered a feature to translate search queries into other languages. A Google spokesperson told Search Engine Land that "Removing features is always tough, but we do think very hard about each decision and its implications for our users. Unfortunately, this feature never saw much pick up".[175]
Instant search was announced in September 2010 as a feature that displayed suggested results while the user typed in their search query, initially only in select countries or to registered users.[176] The primary advantage of the new system was its ability to save time, with Marissa Mayer, then-vice president of search products and user experience, proclaiming that the feature would save 2–5 seconds per search, elaborating that "That may not seem like a lot at first, but it adds up. With Google Instant, we estimate that we'll save our users 11 hours with each passing second!"[177] Matt Van Wagner of Search Engine Land wrote that "Personally, I kind of like Google Instant and I think it represents a natural evolution in the way search works", and also praised Google's efforts in public relations, writing that "With just a press conference and a few well-placed interviews, Google has parlayed this relatively minor speed improvement into an attention-grabbing front-page news story".[178] The upgrade also became notable for the company switching Google Search's underlying technology from HTML to AJAX.[179]
Instant Search could be disabled via Google's "preferences" menu by users who did not want its functionality.[180]
The publication 2600: The Hacker Quarterly compiled a list of words that Google Instant did not show suggested results for, with a Google spokesperson giving the following statement to Mashable:[181]
There are several reasons you may not be seeing search queries for a particular topic. Among other things, we apply a narrow set of removal policies for pornography, violence, and hate speech. It's important to note that removing queries from Autocomplete is a hard problem, and not as simple as blacklisting particular terms and phrases.
In search, we get more than one billion searches each day. Because of this, we take an algorithmic approach to removals, and just like our search algorithms, these are imperfect. We will continue to work to improve our approach to removals in Autocomplete, and are listening carefully to feedback from our users.
Our algorithms look not only at specific words, but compound queries based on those words, and across all languages. So, for example, if there's a bad word in Russian, we may remove a compound word including the transliteration of the Russian word into English. We also look at the search results themselves for given queries. So, for example, if the results for a particular query seem pornographic, our algorithms may remove that query from Autocomplete, even if the query itself wouldn't otherwise violate our policies. This system is neither perfect nor instantaneous, and we will continue to work to make it better.
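The approach described in the statement can be illustrated with a toy filter; the blocklist and transliteration table below are hypothetical stand-ins, not Google's actual data or logic:

```python
# Toy Autocomplete filter: block a query if any token, or its
# transliteration, appears on a (hypothetical) blocklist.
BLOCKLIST = {"badword"}
TRANSLITERATIONS = {"badvord": "badword"}  # e.g. a romanized foreign term

def allowed(query: str) -> bool:
    for token in query.lower().split():
        token = TRANSLITERATIONS.get(token, token)  # normalize first
        if token in BLOCKLIST:
            return False
    return True

print(allowed("funny badvord memes"))  # False: transliteration matches
```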
PC Magazine discussed the inconsistency in how some forms of the same topic are allowed; for instance, "lesbian" was blocked, while "gay" was not, and "cocaine" was blocked, while "crack" and "heroin" were not. The report further stated that seemingly normal words were also blocked due to pornographic innuendos, most notably "scat", likely due to having two completely separate contextual meanings, one for music and one for a sexual practice.[182]
On July 26, 2017, Google removed Instant results, citing the growing number of searches on mobile devices, where interaction with search and screen sizes differ significantly from those on a computer.[183][184]
"Instant previews" allowed previewing screenshots of search results' web pages without having to open them. Clicking on the magnifying glass beside the search-result links will show a screenshot of the web page and highlight the image's relevant text. Google said that the feature "helps people find information faster by showing a visual preview of each result." The snapshots of web pages are stored on Google's servers.[185] The feature was introduced in November 2010 to the desktop website and removed in April 2013, citing low usage.[185][186]
Various search engines provide encrypted Web search facilities. In May 2010, Google rolled out SSL-encrypted web search.[187] The encrypted search could be accessed at encrypted.google.com.[188] Today, however, web search is encrypted via Transport Layer Security (TLS) by default, so every search request is automatically encrypted as long as the web browser supports TLS.[189] On its support website, Google announced that the address encrypted.google.com would be turned off on April 30, 2018, citing the fact that all Google products and most new browsers use HTTPS connections as the reason for the discontinuation.[190]
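That search traffic rides on TLS can be observed directly with standard tooling; the sketch below, using only the Python standard library, opens a TLS connection to www.google.com and reports the negotiated protocol version:

```python
import socket
import ssl

# Open a TLS connection to Google and report the negotiated version;
# any HTTPS search request rides on a handshake like this one.
ctx = ssl.create_default_context()
with socket.create_connection(("www.google.com", 443)) as sock:
    with ctx.wrap_socket(sock, server_hostname="www.google.com") as tls:
        print(tls.version())  # e.g. 'TLSv1.3'
```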
Google Real-Time Search was a feature of Google Search in which search results also sometimes included real-time information from sources such as Twitter, Facebook, blogs, and news websites.[191] The feature was introduced on December 7, 2009,[192] and went offline on July 2, 2011, after the deal with Twitter expired.[193] Real-Time Search included Facebook status updates beginning on February 24, 2010.[194] A similar feature was already available on Microsoft's Bing search engine, which showed results from Twitter and Facebook.[195] The interface showed a live, descending "river" of posts in the main region (which could be paused or resumed), while a bar chart of the frequency of posts containing a given search term or hashtag was located in the right-hand corner of the page, above a list of the most frequently reposted posts and outgoing links. Hashtag search links were also supported, as were "promoted" tweets hosted by Twitter (located persistently on top of the river) and thumbnails of retweeted image or video links.
In January 2011, geolocation links of posts were made available alongside results in Real-Time Search. In addition, posts containing syndicated or attached shortened links were made searchable by the link: query option. In July 2011, Real-Time Search became inaccessible, with the Real-Time link in the Google sidebar disappearing and a custom 404 error page generated by Google returned at its former URL. Google originally suggested that the interruption was temporary and related to the launch of Google+;[196] they subsequently announced that it was due to the expiry of a commercial arrangement with Twitter to provide access to tweets.[197]
List of search engines by popularity – software systems for finding relevant information on the Web
Sherman, Chris; Price, Gary (May 22, 2008). "The Invisible Web: Uncovering Sources Search Engines Can't See". Illinois Digital Environment for Access to Learning and Scholarship. University of Illinois at Urbana–Champaign. hdl:2142/8528.
Gray, Megan (October 2, 2023). "How Google Alters Search Queries to Get at Your Wallet". Archived from the original on October 2, 2023. "This onscreen Google slide had to do with a 'semantic matching' overhaul to its SERP algorithm. When you enter a query, you might expect a search engine to incorporate synonyms into the algorithm as well as text phrase pairings in natural language processing. But this overhaul went further, actually altering queries to generate more commercial results."
Parramore, Lynn (October 10, 2010). "The Filter Bubble". The Atlantic. Archived from the original on August 22, 2017. Retrieved April 20, 2011. "Since Dec. 4, 2009, Google has been personalized for everyone. So when I had two friends this spring Google 'BP,' one of them got a set of links that was about investment opportunities in BP. The other one got information about the oil spill."
El-Bermawy, Mostafa M. (November 18, 2016). "Your Filter Bubble is Destroying Democracy". Wired. Retrieved March 3, 2017. "The global village that was once the internet ... digital islands of isolation that are drifting further apart each day ... your experience online grows increasingly personalized."
"Google Instant Search: The Complete User's Guide". Search Engine Land. September 8, 2010. Archived from the original on October 20, 2021. Retrieved October 5, 2021. "Google Instant only works for searchers in the US or who are logged in to a Google account in selected countries outside the US."