Plucker and website - need some help
ricktt3
12-10-2007, 04:50 PM
I have been trying to find a way to get the current Florida statutes on my PDA. I figure the best way to do this is to use Plucker to get the data from this website (http://www.leg.state.fl.us/Statutes/index.cfm?Mode=View%20Statutes&Submenu=1&Tab=statutes&CFID=35561099&CFTOKEN=47567487). The trouble is, I just want the statutes; I don't want the other information and links. Do you think there is any way to do this, or do I need to try something else? Thanks in advance for any advice or help!
sgosnell
12-10-2007, 04:53 PM
I haven't used Plucker in some time, but you may be able to exclude certain URLs. You can certainly restrict it to the domain, and even to a directory. I don't know if the statutes are all in the same directory, but if they are, that should work.
Off topic: Why would a Tiger want to know the Florida statutes? :eek: :cool:
PinCushionQueen
12-10-2007, 05:19 PM
Using Sunrise, you can set up a wildcard filter for either inclusion or exclusion. I usually look at the URLs for the parts I want to keep to see if they all share a common pattern, and then I look at the parts I don't want for a possible pattern.
For example, for the site that you provided, I would try an "Inclusion" filter like this: http://www.leg.state.fl.us/Statutes/* where * is the wildcard and all the links share the same text up to that point.
Sometimes you have to play around a bit to find a good filter. Otherwise, you could do multiple Exclusion filters for all of the parts you don't want.
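As a rough illustration of the inclusion/exclusion idea described above, here is a minimal Python sketch. It is not how Sunrise or Plucker implement their filters; it only uses Python's standard fnmatch module to show how a wildcard pattern decides which URLs are kept. The exclusion pattern and the second and third test URLs are made up for the example.

```python
from fnmatch import fnmatchcase

def keep_url(url, include=(), exclude=()):
    """Keep a URL if it matches an inclusion pattern (when any are given)
    and matches none of the exclusion patterns."""
    if include and not any(fnmatchcase(url, pat) for pat in include):
        return False
    return not any(fnmatchcase(url, pat) for pat in exclude)

# Inclusion pattern from the post above; '*' matches any run of characters.
include = ["http://www.leg.state.fl.us/Statutes/*"]
# Hypothetical exclusion pattern, only to show the mechanism.
exclude = ["*Mode=Search*"]

candidates = [
    "http://www.leg.state.fl.us/Statutes/index.cfm?App_mode=Display_Statute",
    "http://www.leg.state.fl.us/Welcome/index.cfm",               # hypothetical URL
    "http://www.leg.state.fl.us/Statutes/index.cfm?Mode=Search",  # hypothetical URL
]

for url in candidates:
    print("keep" if keep_url(url, include, exclude) else "skip", url)
```

Only the first URL is kept: the second fails the inclusion pattern, and the third matches the exclusion pattern.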
ricktt3
12-11-2007, 09:26 AM
My goal with this website is to keep just the statutes and do away with all the other content so it won't show up in the Plucker output. I have used the exclusions to exclude links, but will the exclusions leave out the rest of it as well? I will see if I can spend some time with it today. Thanks!
sgosnell wrote: "Off topic: Why would a Tiger want to know the Florida statutes?"
:) I live in FL. Some conversation came up recently at work where this info would have been helpful (actually it would have helped me win an argument!).
philpalm
12-11-2007, 11:15 AM
I tried using Sunrise for this section:
http://www.leg.state.fl.us/Statutes/index.cfm?App_mode=Display_Statute&Search_String=&URL=Ch0320/SEC01.HTM&Title=->2007->Ch0320->Section%2001#0320.01
And got this message:
java.lang.IllegalArgumentException
at java.net.URI.create(Unknown Source)
at com.distantchord.sdl.Source.setRestrict(Unknown Source)
at com.distantchord.sdl.Source.setPath(Unknown Source)
at com.distantchord.sunrise.apps.Desktop$QuickDocumentHandler$1.run(Unknown Source)
at org.eclipse.swt.widgets.RunnableLock.run(Unknown Source)
at org.eclipse.swt.widgets.Synchronizer.runAsyncMessages(Unknown Source)
at org.eclipse.swt.widgets.Display.runAsyncMessages(Unknown Source)
at org.eclipse.swt.widgets.Display.readAndDispatch(Unknown Source)
at com.distantchord.swt.Window.executeEventThread(Unknown Source)
at com.distantchord.sunrise.apps.Desktop.run(Unknown Source)
at com.distantchord.sunrise.apps.Desktop.main(Unknown Source)
--------------------------------------------------
So it looks like Plucker/Sunrise had Java problems...
sgosnell
12-11-2007, 01:31 PM
The problem is probably that Sunrise is being pointed at a dynamic source. Delete all the search terms and it should work. Try http://www.leg.state.fl.us/Statutes/. You will then need to set the excludes so you don't get the links you don't want. Dynamic pages (ColdFusion .cfm pages, in this case) and plucking don't play together nicely.
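For anyone unsure what "delete all the search terms" means in practice, here is a small Python sketch using only the standard library. It is an illustration, not part of Sunrise or Plucker. It trims the URL from the earlier post down to its page and its directory, and lists any characters that are not allowed unescaped in a URI. The unescaped ">" in the "->" pieces is one plausible reason java.net.URI.create rejected the address, though the trace alone does not confirm it. The function names are invented for this example.

```python
from urllib.parse import urlsplit, urlunsplit

# Characters that may not appear unescaped in a URI (a short, non-exhaustive list).
UNSAFE = set('<>"{}|\\^` ')

def strip_query(url):
    """Drop the query string and fragment; keep scheme, host and path."""
    p = urlsplit(url)
    return urlunsplit((p.scheme, p.netloc, p.path, "", ""))

def directory_of(url):
    """Return the URL of the directory that contains the page."""
    p = urlsplit(url)
    directory = p.path.rsplit("/", 1)[0] + "/"
    return urlunsplit((p.scheme, p.netloc, directory, "", ""))

def unsafe_chars(url):
    """List the distinct characters in url that should have been percent-encoded."""
    return sorted(set(url) & UNSAFE)

url = ("http://www.leg.state.fl.us/Statutes/index.cfm?App_mode=Display_Statute"
       "&Search_String=&URL=Ch0320/SEC01.HTM"
       "&Title=->2007->Ch0320->Section%2001#0320.01")

print(unsafe_chars(url))   # ['>']
print(strip_query(url))    # http://www.leg.state.fl.us/Statutes/index.cfm
print(directory_of(url))   # http://www.leg.state.fl.us/Statutes/
```

The last line is the same starting address suggested above.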
philpalm
12-11-2007, 04:54 PM
I could copy each page and transfer it to an RTF file, but I am not proficient enough to delete the search terms using Sunrise.
Using the regular Plucker desktop, you enter the URL and then set the depth to 2 or 3? Anyway, the original poster has probably figured out a way to download the statutes. I was just fooling around, which you have to do to learn how things work.
sgosnell
12-11-2007, 05:18 PM
To change the URL, just remove everything after the end of the part you want, just like you would in a browser. The depth depends on how the site is constructed. I usually start with 2, but often change it to 1 or 3, depending on how the initial conversion goes. Be wary of setting it higher than 3: the conversion may never end, because it tries to get every page on the web, or close to it. I don't know of a good way to tell exactly what depth is right for a given site. The ideal is where everything is on the same page and you can use 1. If it's not, it's a matter of how deep the site goes.
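To show why the page count climbs so quickly with depth, here is a toy Python sketch of a depth-limited crawl over a made-up link graph. It assumes depth 1 means "the start page only", which matches the advice above but is an assumption about Plucker's numbering. The page names and links are invented; nothing is fetched from the real site.

```python
from collections import deque

def pages_within_depth(links, start, max_depth):
    """Breadth-first walk of a link graph.
    The start page counts as depth 1; pages at max_depth are kept
    but their links are not followed."""
    seen = {start}
    queue = deque([(start, 1)])
    while queue:
        page, depth = queue.popleft()
        if depth >= max_depth:
            continue
        for target in links.get(page, ()):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return seen

# Invented site layout: index -> titles -> chapters -> sections.
links = {
    "index":  ["title1", "title2"],
    "title1": ["ch1", "ch2"],
    "title2": ["ch3"],
    "ch1":    ["sec1", "sec2"],
    "ch3":    ["index"],  # a link back to the top is only visited once
}

for depth in (1, 2, 3, 4):
    print(depth, len(pages_within_depth(links, "index", depth)))
```

On this tiny graph the counts are 1, 3, 6 and 8 pages. On a real site with dozens of links per page, each extra level multiplies the total, which is why a depth above 3 can seem to run forever unless the crawl is also restricted to one domain or directory.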