Random observation
Sunday, 28 August 2005 17:51
When searching for Maltese words and finding lots of other-language sites which happen to use an identically-spelled word, it's amazing how well you can narrow down to Maltese sites by adding "hu" to the search terms. It's a word that occurs fairly frequently in Maltese, so chances are good that the pages containing the word(s) you were looking for also contain "hu", but infrequently in most other languages, so those pages get filtered out.
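To make the trick concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the helper names `with_marker` and `matches_all` are made up, the "pages" are placeholder strings rather than real Maltese or Czech text, and the matching is a crude stand-in for whatever a real search engine does.

```python
def with_marker(query: str, marker: str = "hu") -> str:
    """Append a language-marker word to a query unless it is already present."""
    terms = query.split()
    if marker not in terms:
        terms.append(marker)
    return " ".join(terms)


def matches_all(page_text: str, query: str) -> bool:
    """Crude stand-in for a search engine: every query term must appear as a word."""
    words = set(page_text.lower().split())
    return all(term.lower() in words for term in query.split())


# Hypothetical placeholder pages (not real text in either language).
pages = {
    "maltese-like": "aaa vuci bbb hu ccc",
    "czech-like": "ddd vuci eee fff",
}

plain_hits = [name for name, text in pages.items() if matches_all(text, "vuci")]
marked_hits = [name for name, text in pages.items()
               if matches_all(text, with_marker("vuci"))]
```

With the bare query both placeholder pages match; with the marker word added, only the page that also contains "hu" survives.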
For example, just now I googled for "vuci", just for the heck of it, since the word popped into my mind. That brought up tons of Czech sites containing the word vůči, whatever that means, since Google seems to ignore diacritics when searching (which can be both a boon and a bane, depending on what you want). Searching for "vuci hu" (without quotes), though, significantly reduced the number of Czech hits.
Searching for "vuċi" would have resulted in only Maltese hits in the first place, I suppose, but there are (or seem to be) many, many web sites with Maltese text that just drop the diacritics, and searching for that term wouldn't have found those.
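For the curious, the diacritic-ignoring behaviour can be imitated with a few lines of Python. This is only a sketch of one common folding approach (Unicode decomposition followed by dropping combining marks); how Google actually does it is not something I know, and `fold_diacritics` is a name made up for this example.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Decompose characters (NFD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# The Czech and the Maltese word collapse to the same plain string.
czech = fold_diacritics("vůči")
maltese = fold_diacritics("vuċi")
```

Both words fold to "vuci", which is why the plain search term matches either. Note that this approach only handles letters that decompose into a base letter plus a mark; the Maltese ħ, for instance, has no such decomposition and would pass through unchanged.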
no subject
Date: Sunday, 28 August 2005 19:11 (UTC)
Take a look at Pass the hát (http://itre.cis.upenn.edu/~myl/languagelog/archives/002201.html): a great article on different results from a string "hat" with or without duplication and/or diacritics.