This is the eepSites cache  for 'i2cp' from http://forum.i2p/viewtopic.php?t=3347 on 7/16/2009. The page may have changed since that time.

yacysearch.i2p - eepsite search engine
yacysearch
Posted: Sat Apr 18, 2009 8:47 am




Joined: 18 Apr 2009
Posts: 10

I have started a new eepsite search engine with YaCy.

I2P Key - (paste this line into your I2P hosts.txt file)
Try it [i2p]


Currently I crawl some central sites (forum.i2p, stat pages ...) to fill the index. If your site is missing from the index, please send me an e-mail (yacysearch@mail.i2p) and I will add your site to the crawler.

More info about YaCy: http://www.yacy.net


Guest
Posted: Sat Apr 18, 2009 2:21 pm







Hey, it really works pretty well.

There was a search engine for I2P before, eepsites.i2p, but its results were really horrible and nothing was up to date.

Yours works pretty well, gives a lot of results, and is up to date.
I just gave it a try...

Thanks.


BookPusher
Posted: Sat Apr 18, 2009 6:32 pm




Joined: 17 Apr 2009
Posts: 10

Very nice, thanks for creating this. For my new eepsite, I see that the home page is indexed, but that the crawler doesn't appear to have followed any further links to index the contents of any deeper pages. Do I need to add any special metatags or anything to enable full site indexing, or does it just need more time? I think it would be great if people could find links to my books when they search.


yacysearch
Posted: Sun Apr 19, 2009 8:06 am




Joined: 18 Apr 2009
Posts: 10

There are many links in the queue. Please be patient 8-)


brother
Posted: Sun Apr 19, 2009 8:29 am
I2Pisshead



Joined: 06 Feb 2009
Posts: 62

I do have a first concern about the index.
Please search for "HCB6rbCB


yacysearch
Posted: Sun Apr 19, 2009 8:50 am




Joined: 18 Apr 2009
Posts: 10

Thanks for the hint. I will refine the crawler settings.


brother
Posted: Sun Apr 19, 2009 9:07 am
I2Pisshead



Joined: 06 Feb 2009
Posts: 62

Thanks, much better now!
Keep on supporting the community with the search engine!


Guest
Posted: Sun Apr 19, 2009 1:44 pm







Hi!

How did you change YaCy's settings to make it work nicely with I2P?
I'm interested! :D


yacysearch
Posted: Sun Apr 19, 2009 1:54 pm




Joined: 18 Apr 2009
Posts: 10

There is an article in the wiki: http://yacy-websuche.de/wiki/index.php/De:YaCy-i2p (German only)


Guest
Posted: Mon Apr 20, 2009 7:19 am








yacysearch wrote:
I have started a new eepsite search engine with YaCy.



There was something strange with the output of the results.
After the obligatory search for "porno" [assumption from the unsafe internet: the more porn a search engine finds, the more useful it generally is, because porn is very common] it displayed
"1-10 of 31" and only 3 links. Then I clicked the link to view the 2nd page of results and it displayed ten results. I returned to the 1st page and suddenly it displayed all 10 results. I tried searching for "porno" once again and it still displayed 10 results :?

BTW, can it be tuned to display more varied results?
The current top ten for "porn" includes
five links titled "Describe CrimeProblems here. I found some kiddie porn, and it upsets me. How",
four titled "forum.i2p ~ View topic - Lack of content, too much porn.",
and one titled "forum.i2p ~ View topic - Where did all the porn go?"


yacysearch
Posted: Mon Apr 20, 2009 8:25 am




Joined: 18 Apr 2009
Posts: 10

It is now a little bit better, I think. Not perfect, but better :D


zzz
Posted: Tue Apr 21, 2009 3:45 pm
I2Phile



Joined: 10 May 2005
Posts: 311

-DNtDzcJPnOay0skqp5UYZEKh8FGi-Vdk4snREc-V2U=

If this is your shared clients Base64 (mouse over the details link on the router console), you have made approx. 1000 requests per hour, for two hours today, for URLs in stats.i2p/cgi-bin/..., in violation of stats.i2p's robots.txt.

stats.i2p is now down and will remain down until whoever is DoSing it posts a message here that it is fixed, or until I have time to implement some blocking or DoS prevention. I would rather spend my time on development that benefits everybody than on implementing website defenses. Dealing with this was not on my list of things to do this week. Thank you.

In addition, if you would post the User-Agent you are using, so we may craft special robots.txt rules for your crawler, assuming that you do get your crawler fixed, that would be nice.
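
For anyone writing or configuring a crawler, here is a minimal sketch of how a robots.txt check can be done before each request, using Python's standard `urllib.robotparser`. The rules in `ROBOTS_TXT`, the `/cgi-bin/example` path, and the helper names are assumptions made up for illustration; the actual stats.i2p robots.txt is not quoted in this thread, and this is not how YaCy itself is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; not the real stats.i2p file.
ROBOTS_TXT = """\
User-agent: yacybot
Disallow: /cgi-bin/
Crawl-delay: 5

User-agent: *
Disallow: /cgi-bin/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True only if the rules allow this agent to request the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# A crawler would run this check before every request and skip on False.
print(may_fetch(parser, "yacybot", "http://stats.i2p/cgi-bin/example"))
print(may_fetch(parser, "yacybot", "http://stats.i2p/index.html"))
print(parser.crawl_delay("yacybot"))
```

A per-agent section such as the `yacybot` one above is the kind of special rule a site owner can write once the crawler's User-Agent is known.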


yacysearch
Posted: Tue Apr 21, 2009 6:13 pm




Joined: 18 Apr 2009
Posts: 10

I'm really sorry about this.
To prevent further access to your site, I have stopped crawling and set the global delay between fetches to 5 seconds.

Once your site is up again, I will take a look at your robots.txt file to find the "bug" that makes YaCy ignore it.
The User-Agent is simply yacy (in some cases yacybot).

Sorry again.
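
To put the numbers in context: a 5 second delay caps a crawler at 3600 / 5 = 720 requests per hour, below the roughly 1000 per hour reported above. The sketch below shows one way a fetch delay can be enforced. It is a per-host variant, which is an assumption; the setting described in this post is a global delay, and the class and function names here are hypothetical and not part of YaCy.

```python
import time
from urllib.parse import urlsplit


class PoliteScheduler:
    """Tracks when each host may next be contacted (hypothetical helper)."""

    def __init__(self, delay_seconds=5.0, clock=time.monotonic):
        self.delay_seconds = delay_seconds
        self.clock = clock  # injectable so the logic can be tested without sleeping
        self._next_allowed = {}

    def wait_time(self, url):
        """Reserve the next slot for the URL's host; return seconds to wait first."""
        host = urlsplit(url).hostname or ""
        now = self.clock()
        ready_at = self._next_allowed.get(host, now)
        start = max(now, ready_at)
        self._next_allowed[host] = start + self.delay_seconds
        return start - now


def max_requests_per_hour(delay_seconds):
    """Upper bound on requests per hour for a fixed delay between fetches."""
    return int(3600 // delay_seconds)


scheduler = PoliteScheduler(delay_seconds=5.0)
for url in ["http://stats.i2p/", "http://stats.i2p/index.html"]:
    pause = scheduler.wait_time(url)
    # A real crawler would call time.sleep(pause) here and then fetch the URL.
    print(url, "wait", round(pause, 1), "seconds")
```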


BookPusher
Posted: Tue Apr 21, 2009 6:29 pm




Joined: 17 Apr 2009
Posts: 10

If that was you, I'm afraid you have another problem as well. That same ID was crawling my site (which is good, I want it to be indexed), but was also downloading all the .zip files and then strangely trying to load files from inside the .zips as if they were web pages, e.g. downloading this book [link: bookpusher.i2p], which is fine, and then trying to download this bogus URL [link: bookpusher.i2p], which comes from a file inside the book .zip file, and which is of course a 404 error.

No harm done really, except for the wasted bandwidth of downloading tens of megabytes of stuff they didn't really want and couldn't use. I'm just glad they didn't go after the .rar files as well; I've only got maybe a hundred .zip files, but .rar is my primary file type, and I've got a couple thousand of those. I'm happy to send out a gigabyte of books, but not when it's just wasting BPS.

Good luck on fixing things. We really need a good, up-to-date search engine site, which yours could be once you get the bugs out.


yacysearch
Posted: Tue Apr 21, 2009 8:22 pm




Joined: 18 Apr 2009
Posts: 10


BookPusher wrote:
...but was also downloading all the .zip files



Yes, I have seen this and have deactivated all zip parsers. This should prevent zip files from being downloaded (I hope).


BookPusher wrote:
Good luck on fixing things. We really need a good, up-to-date search engine site, which yours could be once you get the bugs out.



Thanks.
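
Regarding the archive downloads discussed above, a crawler can also skip such files before requesting them, independently of which parsers are enabled. This is a minimal sketch assuming a simple extension blocklist; the `SKIPPED_EXTENSIONS` set, the `should_fetch` name, and the example URLs are hypothetical and do not describe YaCy's own settings.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Assumed blocklist; extend it with whatever file types should not be crawled.
SKIPPED_EXTENSIONS = {".zip", ".rar"}


def should_fetch(url):
    """Return False for URLs whose path ends in a blocked archive extension."""
    path = urlsplit(url).path  # drops the query string and fragment
    suffix = PurePosixPath(path).suffix.lower()
    return suffix not in SKIPPED_EXTENSIONS


# Hypothetical URLs for illustration.
for url in [
    "http://bookpusher.i2p/books/example.zip",
    "http://bookpusher.i2p/index.html",
]:
    print(url, "fetch" if should_fetch(url) else "skip")
```

Filtering on the URL alone saves the bandwidth of the download itself, which a parser setting may not do if the file is fetched before its type is checked.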


All times are GMT

Page 1 of 2