To run a background check on someone using a search engine. While a Web search can only uncover a person's Web footprint, searching for a person's name can be useful in preliminary research in biz dev. Or, if you can believe Deborah Schoeneman of the New York Observer, in researching prospective dates. From her article "Don't Be Shy, Ladies--Google Him!": "With Googling, it's easy to find out if a new crush has ever made news, has ever been published or, on the flip side, has ever been indicted for securities fraud--or worse, posted love letters on a Backstreet Boys fan site." (New York Observer, January 1, 2001.) Or take this example, from a skit on Harry Shearer's Le Show: Vice President Cheney: "Come on, Condi, you and I both know there's no such thing as University of Denver." Condoleezza Rice: "There is too. Google me." Cf. SharQ's note re: the Oxford English Dictionary's definition.
On February 12, 2001, Google also swallowed Deja, calling it Google Groups (http://groups.google.com/).
Deja was a fine service, but then it became less and less usable as it added more and more ads - and it also tried to be some sort of "precision buying" site. People didn't like it that much anymore, and it ended up in financial trouble. Then eBay's half.com bought the precision buying part and Google bought the Usenet archives.
Google Groups is sort of like the original DejaNews - it has a very clean interface, just like the rest of Google. They also made a Great Cultural Act by putting the old DejaNews Usenet archives (since 1995) back online (Deja took most of them offline).
On December 11, 2001, they made the entire Usenet news archive since 1981 available! This event was a Great Cultural Act on a Totally Massive Scale. =)
This is, of course, only one reason why Google is my second-most-favorite web service after E2... As mentioned in other writeups, Google web search is very, very good, and their site is very light. Also, Mozilla supports Google toolbar search "out of the box", and they have a gigantic search toolbar toy for MSIE, too (an imitation of which is in the works for Mozilla).
For Mozilla users: The toolbar search is pretty good, but I'd rather use the bookmark feature. Basically, I added a bookmark with URL http://www.google.com/search?q=%s and put "g" in the Keyword. This way, I can search for things by typing "g searchterms" in the toolbar... Why this? Well, I also added a bookmark http://www.everything2.com/?node=%s with keyword "e2", and added similar bookmarks for dictionary.com, IMDb and others. I can search from my favorite websites easily =)
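The keyword trick boils down to string substitution: whatever follows the keyword is dropped into the %s slot of the bookmark URL. Here is a minimal Python sketch of that idea. The `expand` helper and the `BOOKMARKS` table are made up for illustration, and URL-encoding the terms with `quote_plus` is my assumption about what happens to spaces, not a description of Mozilla's actual code.

```python
from urllib.parse import quote_plus

# Keyword -> bookmark URL template, as set up in the bookmark manager.
BOOKMARKS = {
    "g": "http://www.google.com/search?q=%s",
    "e2": "http://www.everything2.com/?node=%s",
}

def expand(typed):
    """Turn 'keyword search terms' into the URL that would be loaded."""
    keyword, _, terms = typed.partition(" ")
    template = BOOKMARKS.get(keyword)
    if template is None:
        return None  # not a keyword bookmark
    # Assumption: terms are URL-encoded before being substituted for %s.
    return template.replace("%s", quote_plus(terms))

print(expand("g usenet archive"))
# prints http://www.google.com/search?q=usenet+archive
```

Adding another site is just one more entry in the table, which is why the same habit works for dictionary.com, IMDb and the rest.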
google v.
[common] To search the Web using the Google search engine, http://www.google.com. Google is highly esteemed among hackers for its significance ranking system, which is so uncannily effective that many hackers consider it to have rendered other search engines effectively irrelevant. The name `google' has additional flavor for hackers because most know that it was copied from a mathematical term for ten to the hundredth power, famously first uttered as `googol' by a mathematician's nine-year-old nephew.
--The Jargon File version 4.3.1, ed. ESR, autonoded by rescdsk.
Good Thing = G = gopher (this node is not part of the Jargon file, but the above is included in order not to break its continuity, as, frankly, the Jargon file entry was a bit crap.)
Google was born in 1995, when Sergey Brin (23) and Larry Page (24), both computer science Ph.D. candidates, meet. They start working together on "a little something" that would make it easier to search the "Internet" - a global networking technology that had been unleashed on the world only a year before.
The name Google is a wordplay on "Googol", which is one of the largest numbers we have a name for, namely 10^100. Never mind that the googolplex (10^googol) has been invented :).
From 1996 to 1999, the technology slowly evolves, growing from a two-server operation run out of Sergey and Larry's dorm rooms into a company incorporated in 1998. At this time, Google gets 10,000 searches every day, and PC Magazine includes the site as one of the top 100 web sites.
In 1999, all hell breaks loose - web traffic worldwide grows exponentially, and Google handles half a million searches every day. AOL/Netscape includes Google in its Netcenter, which really gets the ball rolling.
By the end of 1999, Google handles three million searches every day, and has nearly 40 employees.
From 2000 on, these are the major parts of Google's development:
2000:
2001
2002
Nope.
Google performs more than 150 million searches every day, and the technology needed to keep something like that running is truly amazing.
Google Enterprises runs the world's largest cluster of Linux servers: more than 10,000 servers (Yes, you read this correctly. Ten thousand.).
When you search, the query is routed through a set of load-balanced, mirrored index servers. When a match is found, the results are pushed through a document server (this is where all the cached data is stored), before they are eventually shown in your web browser.
Did I say eventually? Oh, sorry. I meant to say "within half a second".
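To make that flow a bit more concrete, here is a toy Python sketch of the two stages: a load balancer picks one of several mirrored index servers, the index yields matching document IDs, and a document server supplies the stored text. Everything in it (server names, the tiny index, round-robin balancing) is invented for illustration; it is not a description of how Google's cluster is actually built.

```python
from itertools import cycle

# Hypothetical mirrored index servers: every mirror holds the same
# word -> document-id index, so any of them can answer a query.
INDEX = {"usenet": [1, 2], "archive": [2, 3]}
index_servers = cycle(["index-a", "index-b", "index-c"])

# Hypothetical document server: holds the cached page data.
DOCUMENTS = {1: "Usenet FAQ", 2: "Usenet archive since 1981", 3: "Web archive"}

def search(query):
    """Return (index server used, matching documents) for a query."""
    server = next(index_servers)  # round-robin load balancing (an assumption)
    hits = None
    for word in query.lower().split():
        ids = set(INDEX.get(word, []))
        hits = ids if hits is None else hits & ids  # every word must match
    doc_ids = sorted(hits or [])
    results = [DOCUMENTS[d] for d in doc_ids]  # document server fills in text
    return server, results

print(search("usenet archive"))
```

Because the mirrors are interchangeable, consecutive queries can land on different machines without the user noticing, which is the point of load balancing.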
As of May 2002, Google contains more than 700 million Usenet messages (that is more than a terabyte of data) - most of what has been said in newsgroups over the past 20 years. It indexes more than 2 billion web pages, and has hundreds of servers dedicated to crawling the web for new information.
Google also offers wireless search technologies, such as a Wap Portal, i-mode portal and other, more exotic technologies, such as a voice activated portal (made in cooperation with BMW, for use in automobiles), et cetera.
Google's PageRank technology is unique, and by far the most important part of the Google technology - it "performs an objective measurement of the importance of web pages and is calculated by solving an equation of 500 million variables and more than 2 billion terms. PageRank uses the vast link structure of the web as an organizational tool. In essence, Google interprets a link from Page A to Page B as a 'vote' by Page A for Page B. Google assesses a page's importance by the votes it receives.
Google also analyzes the pages that cast the votes. Votes cast by pages that are themselves 'important' weigh more heavily and help to make other pages important." (source: the Google press center). This technology means that if you enter "microsoft", Microsoft's website is likely to come up on top, increasing the chances of you finding what you need. (Bad example - why would you need Google to find Microsoft? Anyway, you know what I mean.)
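The "votes weighted by the voter's own importance" idea can be sketched in a few lines of Python. This is a simplified power-iteration version based on the commonly published description of PageRank, not Google's production code: the damping factor of 0.85, the fixed iteration count, the handling of pages without outgoing links, and the four-page example web are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal importance
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # A page with no links spreads its vote over all pages.
                for t in pages:
                    new[t] += damping * rank[page] / n
        rank = new
    return rank

# Hypothetical four-page web: A, C and D all link to B.
web = {"A": ["B"], "B": ["C"], "C": ["B"], "D": ["B", "C"]}
ranks = pagerank(web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy web, B collects the most votes and ends up on top; C comes second largely because its one big vote comes from the important page B.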
But Google is always continuing its research... New features include:
But it doesn't stop there - the Google Labs site (labs.google.com) is a showcase of all sorts of new search engine goodies, currently in Alpha or Beta status, which may or may not be released in the future:
Try them yourself: All functions that are out of beta are available on http://www.google.com/options/ All functions that are still in beta are available on http://labs.google.com/
The word "Google" has had a far larger impact on our language than any other name of an internet site. For one thing, it is one of the very few site names that has been accepted into the Oxford English Dictionary as a word: "To Google" is synonymous with searching the internet - even if you intend to use a different search engine than the Google engine itself. (Granted "to Slashdot" and "Slashdotted" are also two words that have taken on two other meanings than the original meaning (the name of the Slashdot site), but these are largely geekspeak, rather than common-use words.)
I am currently researching the exact topology and specs for Google's servers, but this might take a while, as their press department isn't too effective.
Recently, trends in domain name registration have been surprising. For the first time in the history of the Internet, there has been a statistically significant reduction in the number of registered domain names. Many doomsayers proclaim that the reason such a trend occurred is the popping of the Internet bubble, the death of E-Commerce, the burning of Silicon Valley, the Y2K bug, the Four Horsemen of the Apocalypse. Such is not the case; to the contrary, there is one obvious over-arching cause that does not suggest the imminent or recent doom of the online world, but rather its gradual redemption.
That cause is Google. Never before has there been such an effective and seemingly foolproof search engine on the Internet. Webmasters no longer find it necessary to register every possible mutation or misspelling of the site's name with Internic. Previously, in the ages of less-effective search engines, simply typing "www.some_company_name.com" in as a URL and crossing your fingers was not that much less viable than taking the trouble to sort through thousands of false Yahoo results to find the true company site on the eighteenth page of results.
Now, however, the most prominent search engine on the Web, Google, is eerily accurate at knowing what you want and where to find it. It is no longer necessary or even advisable to register extraneous domain names in the hopes of capturing a shot-in-the-dark visitor. Doing so means dividing other people's links to you among all of your different DNS entries, which, due to the mechanics of Google's PageRank algorithm, translates to a counter-intuitive reduction in your site's priority on Google. Because the number of users who use Google to find a given site far outweighs the number who use the out-of-date shot-in-the-dark method, it is wise to cater to the Google crowd. Google has succeeded in aligning the best interests of a site's owner with the best interests of the public, and the result is a cleaner, more efficient Internet.
Unquestionably, the internet search engine Google (http://www.google.com) provides fast and effective searching on whatever topic you wish, scouring the world wide web and Usenet for results.
However, most people are not aware of the enormous privacy violation that goes along with using Google. Google, in fact, stores a great deal of personal information about you whenever you use the site and utilizes this information to alter search results that you get.
The Google Cookie
Use the options for your web browser right now and take a look at the cookies stored on your hard drive. If you're allowing cookies (and you are if you're not sure), then Google is actively gathering information about you when you use the site.
Don't believe me? Here's how to find the Google cookie on your own machine:
Internet Explorer: Choose "Internet Options" from the "Tools" menu. When the window pops up, click on the "Settings..." button in the "Temporary Internet Files" section. On the next screen, click on the "View Files" button. You'll have a window pop up containing (likely) a long list of files. The file you're looking for should be called "Cookie:{somename}@google.txt".
Mozilla/Netscape/Phoenix: Choose "Preferences" from the "Edit" menu (or "Tools" menu in some versions). Once there, click on "Privacy" then on the "Manage Cookies..." button that appears. Click on the one from google.com.
If you look at the cookie, you'll see that it contains a string that looks like this:
ID=8b1353970dfc234c:TM=1423423345:LM=4323256345:S=iq4Ca68bAn8
This string contains a unique identifier (the ID= part) that identifies you to Google. Thus, every Google search you do is matched specifically to you in their database.
Even more frightening is the fact that the cookie does not expire until Sunday, January 17, 2038, at 1:00:00 PM (or so). This means that until that date (unless you actively remove the cookie), Google will be tracking all of your searches.
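If you want to pick the cookie apart yourself, a minimal Python sketch follows. It splits the example string above into its fields, and it also prints the latest moment a signed 32-bit Unix clock can represent, which falls two days after the cookie's expiry date. My guess is that the expiry was simply set as far into the future as such a clock comfortably allows; that is an assumption on my part, not something Google states.

```python
from datetime import datetime, timezone

cookie = "ID=8b1353970dfc234c:TM=1423423345:LM=4323256345:S=iq4Ca68bAn8"

# Split the colon-separated "KEY=value" pairs into a dictionary.
fields = dict(part.split("=", 1) for part in cookie.split(":"))
print(fields["ID"])
# prints 8b1353970dfc234c

# Largest time a signed 32-bit Unix timestamp can hold.
limit = datetime.fromtimestamp(2**31 - 1, tz=timezone.utc)
print(limit.isoformat())
# prints 2038-01-19T03:14:07+00:00
```

The ID field is the part that matters for tracking: as long as it stays the same, every request carrying it can be tied to the same browser.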
How This Violates Your Privacy
In conjunction with the cookie, Google can quite easily store in their database your exact location, your computer's identification, what you searched for, and when you searched for it.
Note: Google can easily do this. The cookie described above, together with the information your browser sends with every request, gives Google all of this data. All they have to do is store it.
This has two major implications.
In other words, Google utilizes your personal information without your permission to turn the site into a marketing tool geared specifically to the information that they took from you.
Longer Term Implications
Some long term implications of this data warehousing include:
No Data Retention Policy
On top of this, Google's privacy policy is severely lacking in one major area: there is no stated data retention policy. This means that Google can store years upon years of data on individual people, detailing their interests, thoughts, hobbies, and other peccadilloes, and they don't even bother to directly state what exactly they are storing about you. Their "privacy policy" isn't a privacy policy at all; it's merely a PR tool.
Evidence for the claims stated here and above can all be read clearly at Google's very own privacy policy, found at http://www.google.com/privacy.html. Some highlights:
Upon your first visit to Google, Google sends a "cookie" to your computer. A cookie is a piece of data that identifies you as a unique user. (Note: a cookie is NOT a piece of data that identifies you as a unique user)
Google notes and saves information such as time of day, browser type, browser language, and IP address with each query. That information is used to verify our records and to provide more relevant services to users.
Google may share information about you with advertisers, business partners, sponsors, and other third parties. However, we only divulge aggregate information about our users and will not share personally identifiable information with any third party without your express consent. For example, we may disclose how frequently the average Google user visits Google, or which other query words are most often used with the query word "Linux." Please be aware, however, that we will release specific personal information about you if required to do so in order to comply with any valid legal process such as a search warrant, subpoena, statute, or court order. (Note: Read this part carefully. Google admits to detailed profiling of individual users in this paragraph)
The Google Toolbar
The Google toolbar (which many users have added to their web browser) goes even further than Google itself in invading your privacy. Using the individual profile that Google has developed (and is described above), the Google toolbar actually reports every page that you visit to Google, adding to their profile about you. In other words, every page you visit and link you click while using the Google toolbar contributes to your stored profile at Google.
Why is this scary? Let's say I wanted to read up on philosophies of technology. I had to take a course on this particular topic during my studies, so it is not an unusual thing to do. Now, suppose I spent a few afternoons reading such documents as The Unabomber's Manifesto or some detailed documents on the Luddite movement. I used Google to find these pages, of course, and made many detailed searches while doing it. After a while, however, the government becomes suspicious of terrorist activity in my local area, so they subpoena Google's databases and retrieve a list of IP addresses in my area that have been reading such inflammatory material. Suddenly, I am in deep legal trouble.
For full details, read the Google toolbar privacy policy at http://toolbar.google.com/privacy.html.
If That Doesn't Concern You, Then Read This...
When the New York Times (feel free to look this up in the November 28, 2002 issue) asked Google head honcho Sergey Brin about whether Google ever gets subpoenaed for its stored information on users and their searching patterns and history, he had no comment. In other words, he refused to deny that Google is willing and able to hand over the search histories of individuals to law enforcement agencies when they request it. In the above section detailing Google's privacy policy, they do in fact state that they are willing to turn over your information to authorities.
Given the broad sweep of the Office of Homeland Security in terms of the ease of gaining subpoenas for investigating supposed acts of terror, I've found myself being very careful what I search for on Google.
You are advised to do the same.
I'm not going to comment on Google's alleged privacy violations except to say that I've said my piece elsewhere. Any opinions expressed in this writeup are my own, and do not necessarily represent the opinion of my employer.
I've learned an awful lot about Google over the past three years. My senior year of college, I worked on a project for a company doing research into link analysis algorithms, including among other things, the paper on PageRank. It's a common myth that the "Page" in PageRank refers to "web pages"; it's actually a reference to the creator (Larry Page).
I use Google to search for information about just about everything these days. It's amazingly handy for doing homework, and if you're in the software engineering business, it's often easier to do a Google search for error messages or function definitions than it is to remember which books document the feature you're looking for. There have been plenty of articles published recently about how people are searching for their dates' names in Google.
Unfortunately, there are many Google features people just don't know about. Some of these are more or less obvious: Google Labs has a few interesting demos on it, and the tabs that appear at the top of every page (web, images, groups, directory, news) can point you to searches over different sorts of information. There are also a couple of interesting features that you won't really notice unless you visit the advanced search page often. (Many new services get quietly launched on the advanced search page: Froogle, Google Catalogs, and various OS specific and university specific searches can be found here.) There are even some services that are more difficult to find: Google Answers isn't very well publicized at all.
But it's annoying to have to go to another page to find the "right place" to do a search. Besides, if I want to search for something that's related to Linux, chances are a regular Google search will bring it up. The really interesting features are the ones that just automatically happen in the regular search box:
Founded in September 1998 by Sergey Brin and Larry Page, Google is billed as the world's premier search engine. With an index of over 4 billion web pages, a PageRank algorithm widely agreed to be the best around, and a brand that is almost unbeatable in the online world, Google's engine is the first choice of most web searchers. The business model is the envy of most dotcom businesses, with revenues exceeding $1 billion and likely to keep increasing. The current income is almost $1 per share, and with a tentative stock price set between $100 and $150, it will have one of the best P/E ratios of any dotcom so far.
The company is doing something particularly interesting with their IPO: they are making it public. Anyone who wants to, and has a computer to check it out, can go to ipo.google.com and register to participate in an auction of shares, in which all of the shares will be distributed. The price range that Google lists for the shares is between $108 and $140, but they will base the final price and allocation on some formula they declined to disclose (which, given that they have teams of statisticians working for them, will probably be pretty well thought out).
The only guarantees are that they will sell all the stock they list, about 24 million class A shares, with a total of 34 million shares outstanding, and 230 million class B shares outstanding, not counting about 12 thousand shares available through stock options. The current stockholders will be selling a total of 10 million shares, at a total value of over $1 billion, and the company will be selling an additional 14 million shares, valued at over $1.5 billion, for as-yet undetermined expenses and acquisitions.
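The dollar figures above follow from the share counts and the listed price range. A quick Python check, using only the numbers quoted in this writeup (the final auction price is unknown, so both ends of the range are shown; the `proceeds` helper is just for illustration):

```python
LOW, HIGH = 108, 140           # listed price range per share, in dollars
holder_shares = 10_000_000     # shares sold by current stockholders
company_shares = 14_000_000    # shares sold by the company

def proceeds(shares):
    """Total value of a block of shares at the low and high end of the range."""
    return shares * LOW, shares * HIGH

print(proceeds(holder_shares))         # (1080000000, 1400000000)
print(proceeds(company_shares))        # (1512000000, 1960000000)
print(holder_shares + company_shares)  # 24000000
```

Even at the bottom of the range, the two blocks come to just over $1 billion and just over $1.5 billion, and together they make up the roughly 24 million shares on offer.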
Google currently owns Pyra Labs, which runs Blogger, and runs Gmail, possibly the hottest account online. They bought DejaNews and transformed it into Google Groups, and run services from Froogle to image search to a news indexing service. They currently run hundreds of millions (yes, 10^8) of web searches a day.
Sources: www.google.com, ipo.google.com
Google's early and strong commitment to their users and their informal corporate motto, "Do No Evil" (or "Don't be evil"), have gained them wide trust. While their web search technology is well known, Google specializes in general data indexing. As a trusted and centralized source of information on the web, Google has assumed immeasurable power. Now, as a publicly traded corporation with over three thousand employees, Google must assume a great responsibility to their users: to deliver fair and relevant results and to protect privacy at all costs. As their user base grows and their popularity increases, decisions at Google will not come without ethical consequences.
The node ranking technology, dubbed PageRank, is used in some form to rank web pages, news articles, images, and user documents. It appears at first glance to be a democratic system, but it is a flawed one. Under most democratic systems, votes are weighted equally. PageRank, however, is mathematically inclined to give more power to relevant pages. To Google's credit, since web pages frequently contain more than one link and thus vote more than once, the page's total importance is at least disseminated among its links. In spite of Google's patents giving away much of the ranking method, certain variables and factors remain a secret and draw skepticism to the impartiality of the ranking scheme. Further, it is known that scrubbing mechanisms other than PageRank prepare the search results for their final display on Google's web page.
Among the further scrubbing methods of search results, one that I found most surprising is that Google censors search results in China, France, and Germany. While I was aware that the targeted material is illegal in those countries, I had previously interpreted Google's "Do No Evil" attitude to include objecting to authoritarian one party states and Third Reich-inspired censorship. It would be virtuous of Google and would further general human knowledge if they were to insist on the mass dissemination of information to all people of the world. Google is free to run their web search from the United States and freely put material on the web outside of the jurisdiction of China, France, or Germany, and in fact their Chinese operation is based in the United States. In the case of China, however, practicality won out over virtue when the Chinese censors completely banned Google. If the search engine was to have any Chinese user base whatsoever, in the interest of their ultimate goal of putting the user first, they needed to comply. While no concrete information is available on how much Google colluded with the Chinese, they did comply with the censors, restoring access to the Chinese citizens.
When not legally obliged to censor results, Google strives for impartiality but leaves many questions unanswered. In a letter titled "An explanation of our search results," Google explains why offensive results can occur for seemingly inoffensive search terms, and how "search results are generated completely objectively and are independent of the beliefs and preferences of those who work at Google." Further, the letter recognizes petitions that requested the removal of hate sites, but Google notes that they only omit sites they are "legally compelled to remove or those maliciously attempting to manipulate [their] results." These claims appear promising and it is quite believable that Google does not omit results, but it is commonly known that certain factors of exact rankings are still hidden from the public.
Google officials recently leaked to the public that they had an internal ethics committee that is periodically in charge of altering the PageRank algorithm. I expected privacy and ethics employees to be found at a company of Google's size, but what I didn't expect to find is that the committee is merely an informal gathering of employees interested in ethics. To ensure impartiality and to form trust in Google's ethical decisions, they need an official ethics committee with training, experience, and direction. Without public statements from Google's official ethics committee as to how they change the algorithm and what criteria they follow in doing so, the impartiality of PageRank is under suspicion.
When complex technology delivers trusted results, as with government cryptographic standards, complete transparency can often be an ideal way to ensure impartiality. In the case of Google, however, complete transparency has immediate drawbacks. Because achieving genuine relevancy is difficult and cheating the system is considerably easier, PageRank has become a moving target for malicious webmasters who consistently overcome the latest algorithm tweaks to achieve high rankings for irrelevant pages. Delivering usable search results consists not only of identifying the relevant pages to place near the top of the rankings, but also of identifying junk or spam pages in order to place those results near the bottom. With these facts in place, I learned an important lesson: complete transparency of Google's ranking methods would result in direct widespread manipulation, rendering the results irrelevant and denying the technology its usefulness.
If the PageRank specifications were to somehow assume complete and deserved trust of their impartiality from the public, I would still hold several issues with their nature. First, PageRank likens the web to a popularity contest. Resembling a student body election, Google is often guilty of burying the most relevant and useful results simply because no one has noticed them yet. Second, if ideological bias does exist in the news media, then I consider PageRank to be culpable in perpetuating it. Many theories of media bias are based around the concept of an echo chamber of unpopular but powerfully backed opinion that drowns out an unbiased or otherwise popular opinion. PageRank is perfectly suited for creating such an echo chamber, in that when a news search returns hundreds of pages, most Google users read the first few and do not ensure that the other 99% corroborate.
Google has become a substantial news source and plays a part in telling the web which stories are relevant, when in fact the web should be telling Google what is relevant. The same issue holds for normal web searches; when Google was a relatively unknown outside observer to the social interactions on the web, PageRank was fresh and promising. As Google slowly begins arbitrating these interactions by dictating what is relevant to trusting masses, PageRank could become stale and merely project importance on its own monstrous creations.
For many of these issues, the time for true remedies is over. Google cannot go back to being a 50-employee company and embrace a proper ethics committee, and they cannot go back to a Stanford dorm room and better democratize PageRank technology. Worse yet, Google will likely never be a non-profit organization with only its users truly in mind; the responsibility of a publicly traded company like Google is to its stakeholders. The company is faced with the challenging task of gaining public trust in its technologies through transparency or other means, while skeptical users are faced with an even more difficult choice: whether to boycott the most useful search technology in history.
BBC News. "10 things the Google ethics committee could discuss." May 20, 2004. http://news.bbc.co.uk/1/hi/magazine/3732475.stm
Brandt, Daniel. "PageRank: Google's Original Sin." Google Watch, Public Information Research, Inc. August 2002. http://www.google-watch.org/pagerank.html
Elgin, Ben. "Google's Chinese Wall." BusinessWeek Online, The McGraw-Hill Companies Inc. September 30, 2004. http://www.businessweek.com/bwdaily/dnflash/sep2004/nf20040930_3318_db046.htm
Garfinkel, Simson. Database Nation: The Death of Privacy in the 21st Century. O'Reilly and Associates, Inc. 2001.
Google. "An explanation of our search results." 2004. http://www.google.com/explanation.html
Google. "Google Code of Conduct." August 18, 2004. http://investor.google.com/conduct.html
Google. "Media Coverage." http://www.google.com/press/press.html
Newton, Jon. "Google and the Chinese Government." TechNewsWorld, September 22, 2004 6:00 AM PT. http://www.technewsworld.com/story/36818.html
Orlowski, Andrew. "Google's Ethics Committee Revealed." The Register. May 17, 2004. http://www.theregister.co.uk/2004/05/17/google_ethics_committee/
Orlowski, Andrew. "Google values its own privacy. How does it value yours?" The Register. April 13, 2004. http://www.theregister.co.uk/2004/04/13/asymmetric_privacy/
Rogers, Ian. "The Google PageRank Algorithm and How It Works." IPR Computing Ltd. http://www.iprcom.com/papers/pagerank/index.html
Våge, Lars. "China's search engine censorship continues." InternetBrus. February 27, 2005. http://www.pandia.com/sw-2005/09-china.html
Wikipedia. "Google." May 27, 2005. http://en.wikipedia.org/wiki/Google
Xia, Bill. "Google Chinese News censorship demonstrated." Dynamic Internet Technology Inc. September 16, 2004. http://www.dit-inc.us/report/google200409/google.htm
Zittrain, Jonathan and Edelman, Benjamin. "Localized Google search result exclusions - Statement of issues and call for Data." Berkman Center for Internet & Society, Harvard Law School. October 26, 2002. http://cyber.law.harvard.edu/filtering/google/