| Author |
Message |
| yacysearch.i2p - eepsite search engine |
|
Posted:
Sat Apr 18, 2009 8:47 am
|
|
|
Joined: 18 Apr 2009
Posts: 10
|
|
I have started a new eepsite search engine with YaCy.
| I2P Key - (paste this line into your I2P hosts.txt file) | | yacysearch.i2p=c9VPGODJnkWNmyBccEW6XmYoXRv8UZevD3B6YmX4owOs8gQwKPlYX1Bn3yDiYHurUaRb~vJDDXADlv~S3ZaI4Z4~P82RAEQLJkGHK9EZl5V4oOgP-WBNLtAcQ5J07~z8JrhEcXSnmces1-oDunA2nnZuninpHlE7f4HI6mrmnpinL6fgAixG-4Ousa1eCz2ygl~RbPsWdDqjejW2o6I6PhQaHEN7kqPfrN2GRDZyGIr2F3L9g2GFs9ZzGCbF9xoaweRN0t~nXBGBWh6ePoa6T~97uv8FqN76aYiny-tABt2kODJayMxKQK143U1jvkLMUK7fsQIlDJGR8eqhsNN46FNYpuYiDpGWS7R2XMWqExuN7QkJZ9rgaucRQscBEWulo4xNB~4SrpfNfdyFfVT32DYThM8L8BKF-xdyuYlDrnQEz2AoDDXEc9SisS3A4aR8aorqO~bv7dN6wV4xeHXIUqCoAI60iHvpYwsQJhxsX-~EajHr9VGfMLvgEh7sOXstAAAA | | Try it [i2p] |
Currently I crawl some central sites (forum.i2p, stat pages ...) to fill the index. If your site is missing from the index, please send me an e-mail (yacysearch@mail.i2p) and I will add your site to the crawler.
More info about YaCy: http://www.yacy.net |
|
|
|

|
|
Posted:
Sat Apr 18, 2009 2:21 pm
|
|
|
|
|
Hey, it really works pretty well.
There was a search engine for I2P before, eepsites.i2p, but its results were really horrible. Nothing was up to date.
Yours works pretty well, gives a lot of results, and is up to date.
I just gave it a try...
Thanks. |
|
|
|

|
|
Posted:
Sat Apr 18, 2009 6:32 pm
|
|
|
Joined: 17 Apr 2009
Posts: 10
|
|
| Very nice, thanks for creating this. For my new eepsite, I see that the home page is indexed, but that the crawler doesn't appear to have followed any further links to index the contents of any deeper pages. Do I need to add any special metatags or anything to enable full site indexing, or does it just need more time? I think it would be great if people could find links to my books when they search. |
|
|
|

|
|
Posted:
Sun Apr 19, 2009 8:06 am
|
|
|
Joined: 18 Apr 2009
Posts: 10
|
|
There are many links in the queue. Please be patient  |
|
|
|

|
|
Posted:
Sun Apr 19, 2009 8:29 am
|
|
|
I2Pisshead
Joined: 06 Feb 2009
Posts: 62
|
|
I do have a first concern about the index.
Please search for "HCB6rbCB |
|
|
|

|
|
Posted:
Sun Apr 19, 2009 8:50 am
|
|
|
Joined: 18 Apr 2009
Posts: 10
|
|
| Thanks for the hint. I will refine the crawler settings. |
|
|
|

|
|
Posted:
Sun Apr 19, 2009 9:07 am
|
|
|
I2Pisshead
Joined: 06 Feb 2009
Posts: 62
|
|
Thanks, much better now!
Keep on supporting the community with the search engine! |
|
|
|

|
|
Posted:
Sun Apr 19, 2009 1:44 pm
|
|
|
|
|
Hi!
In what way did you change YaCy's settings to make it work nicely with I2P?
I'm interested!  |
|
|
|

|
|
Posted:
Sun Apr 19, 2009 1:54 pm
|
|
|
Joined: 18 Apr 2009
Posts: 10
|
|
|
|
|

|
|
Posted:
Mon Apr 20, 2009 7:19 am
|
|
|
|
|
| yacysearch wrote: | I have started a new eepsite search engine with YaCy.
|
There was something strange with the output of the results.
After the obligatory search for "porno" [an assumption from the unsafe internet: the more porn a search engine finds, the more useful it generally is, because porn is very common] it displayed
"1-10 of 31" and only 3 links. Then I clicked the link to view the 2nd page of results and it displayed ten results. I returned to the 1st page and suddenly it displayed all 10 results. I searched for "porno" once again and it still displayed 10 results.
BTW, can it be tuned to display more varied results?
The current top ten for "porn" includes
five links to "Describe CrimeProblems here. I found some kiddie porn, and it upsets me. How" titles,
four "forum.i2p ~ View topic - Lack of content, too much porn."
and one "forum.i2p ~ View topic - Where did all the porn go?" |
|
|
|

|
|
Posted:
Mon Apr 20, 2009 8:25 am
|
|
|
Joined: 18 Apr 2009
Posts: 10
|
|
It is now a little bit better, I think. Not perfect, but better. |
|
|
|

|
|
Posted:
Tue Apr 21, 2009 3:45 pm
|
|
|
I2Phile
Joined: 10 May 2005
Posts: 311
|
|
-DNtDzcJPnOay0skqp5UYZEKh8FGi-Vdk4snREc-V2U=
If this is your shared clients base 64 (mouse over the details link on the router console), you have made approx. 1000 requests per hour, for two hours today, for URLs in stats.i2p/cgi-bin/..., in violation of stats.i2p's robots.txt.
stats.i2p is now down and will remain down until whoever is DOSing posts a message here that it is fixed, or until I have time to implement some blocking or DOS prevention. I would rather spend my time on development that benefits everybody than on implementing website defenses. Dealing with this was not on my list of things to do this week. Thank you.
In addition, if you would post the User-Agent you are using, so we may craft special robots.txt rules for your crawler, assuming that you do get your crawler fixed, that would be nice. |
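
For anyone who wants to check how a crawler should read such rules, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content below is an assumption made up for illustration (the real stats.i2p file is not shown in this thread), and `may_fetch` is a hypothetical helper name. The `yacybot` agent name is the one reported later in this thread.

```python
from urllib.robotparser import RobotFileParser

# ASSUMPTION: illustrative rules only, not the actual stats.i2p robots.txt.
ROBOTS_TXT = """\
User-agent: yacybot
Disallow: /cgi-bin/
Crawl-delay: 5

User-agent: *
Disallow: /cgi-bin/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True only if the rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)
# The URL paths are hypothetical examples.
print(may_fetch(parser, "yacybot", "http://stats.i2p/cgi-bin/example"))
print(may_fetch(parser, "yacybot", "http://stats.i2p/index.html"))
print(parser.crawl_delay("yacybot"))
```

A crawler that runs a check like this before every request would skip everything under a disallowed path instead of requesting it.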
|
|
|

|
|
Posted:
Tue Apr 21, 2009 6:13 pm
|
|
|
Joined: 18 Apr 2009
Posts: 10
|
|
I'm really sorry about this.
To prevent further access to your site, I have stopped crawling and set the global delay between fetches to 5 sec.
When your site is up again, I will take a look at your robots.txt file to find the "bug" that makes YaCy ignore it.
The user agent is simply yacy (in some cases yacybot).
Sorry again. |
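
The effect of a fetch delay like the 5 sec setting above can be sketched roughly as follows. This is not YaCy's code. It is a per-host variant written for illustration (the post describes a global delay), and `HostThrottle` is a hypothetical name. The clock and sleep functions are injectable so the logic can be tried without actually waiting.

```python
import time
from urllib.parse import urlsplit


class HostThrottle:
    """Enforce a minimum pause between two requests to the same host."""

    def __init__(self, min_delay_s=5.0, clock=time.monotonic, sleep=time.sleep):
        self.min_delay_s = min_delay_s
        self._clock = clock
        self._sleep = sleep
        self._last_fetch = {}  # host -> time of the previous request

    def wait(self, url):
        """Block until the host may be contacted again; return seconds waited."""
        host = urlsplit(url).hostname or ""
        waited = 0.0
        last = self._last_fetch.get(host)
        if last is not None:
            remaining = self.min_delay_s - (self._clock() - last)
            if remaining > 0:
                self._sleep(remaining)
                waited = remaining
        # Record the moment this request is released.
        self._last_fetch[host] = self._clock()
        return waited
```

Calling `wait(url)` right before each download keeps requests to one eepsite at least `min_delay_s` apart, while different hosts do not hold each other up.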
|
|
|

|
|
Posted:
Tue Apr 21, 2009 6:29 pm
|
|
|
Joined: 17 Apr 2009
Posts: 10
|
|
If that was you, I'm afraid you have another problem as well. That same ID was crawling my site (which is good, I want it to be indexed), but was also downloading all the .zip files and then strangely trying to load files from inside the .zips as if they were web pages, e.g. downloading this book [link: bookpusher.i2p], which is fine, and then trying to download this bogus URL [link: bookpusher.i2p], which comes from a file inside the book .zip file, and which is of course a 404 error.
No harm done really, except for the wasted bandwidth of downloading tens of megabytes of stuff they didn't really want and couldn't use. I'm just glad they didn't go after the .rar files as well; I've only got maybe a hundred .zip files, but .rar is my primary file type, and I've got a couple thousand of those. I'm happy to send out a gigabyte of books, but not when it's just wasting BPS.
Good luck on fixing things, we really need a good up to date search engine site like yours could be once you get the bugs out. |
|
|
|

|
|
Posted:
Tue Apr 21, 2009 8:22 pm
|
|
|
Joined: 18 Apr 2009
Posts: 10
|
|
| BookPusher wrote: | | ...but was also downloading all the .zip files |
Yes, I have seen this and I have deactivated all zip parsers. This should prevent zip files from being downloaded (I hope).
| BookPusher wrote: | | Good luck on fixing things, we really need a good up to date search engine site like yours could be once you get the bugs out. |
Thanks. |
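
Independent of the parser setting, a crawler can also refuse archive URLs before fetching them. The following is a minimal sketch of such a filter, not something YaCy is confirmed to do. `should_crawl` and the extension list are assumptions for illustration, and the example paths are made up.

```python
import posixpath
from urllib.parse import urlsplit

# ASSUMPTION: archive types to skip, taken from the file types named in this thread.
SKIPPED_EXTENSIONS = frozenset({".zip", ".rar"})


def should_crawl(url, skipped=SKIPPED_EXTENSIONS):
    """Return False for URLs whose path ends in a skipped archive extension."""
    path = urlsplit(url).path
    extension = posixpath.splitext(path)[1].lower()
    return extension not in skipped


# Hypothetical paths on the eepsite mentioned above.
for candidate in (
    "http://bookpusher.i2p/index.html",
    "http://bookpusher.i2p/books/example.zip",
    "http://bookpusher.i2p/books/example.RAR?dl=1",
):
    print(candidate, should_crawl(candidate))
```

Because the check looks only at the URL path, it costs no bandwidth. It will not catch archives served under a URL without a telltale extension.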
|
|
|

|
|