The new updated list of CH DVDs (data miner)
#426
DVD Talk Gold Edition
Does this display properly for anyone using Open Office?
Each column seems to hold several kinds of data rather than only its own (i.e., selection numbers mixed in with all sorts of other data like widescreen, etc.).
Is this something I can fix with a setting change?
#428
DVD Talk Gold Edition
OOo properly recognizes and opens it as delimited but the data is still being displayed as above. Is there anyone else using OOo that is experiencing this?
I've tried both semicolon- and comma-delimited (and both together) but no change.
I don't know enough about how this works but there does appear to be duplicate commas and unneeded characters within the csv file (not sure if they are needed or not or could even be the cause but, without additional eyes, is the only thing I can think of).
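One way to test the "duplicate commas" hunch is to count how many fields each row splits into; rows that come out with more fields than the rest are the ones being thrown off. This is only a rough sketch: the sample rows are made up, and `field_count_report` is a hypothetical helper, not part of anyone's datamine.

```python
import csv
import io
from collections import Counter


def field_count_report(text, delimiter=","):
    """Return a Counter mapping number-of-fields -> number-of-rows."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    # Skip completely empty lines so they don't show up as zero-field rows.
    return Counter(len(row) for row in reader if row)


# Hypothetical sample: the second row has an unquoted comma in its title.
sample = (
    "1001,Some Title,Widescreen\n"
    "1002,Another Title, The Sequel,Widescreen\n"
    "1003,Third Title,Fullscreen\n"
)

report = field_count_report(sample)
print(report)
```

If nearly every row has the same field count and a handful have one or two extra, stray commas inside fields are a likely culprit rather than a spreadsheet setting.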
#429
DVD Talk Platinum Edition
Join Date: Jul 2003
Location: New Hampshire
Posts: 3,096
Originally Posted by basaro
OK guys, I have something small and slightly usable up now at:
http://www.saladbarscam.com/ch
Note the www. and /ch are required, otherwise you won't get there.
It's a rough sample of what will ultimately be available. I need to do a lot more work on it. Now it's just a sortable list of all the titles (from jhester's first datamine) available only at 50 titles at a time (the 50 limit is temporary for this trial). I will update it with his second datamine soon, and work on the other data from nickelplated as well.
I welcome all comments and suggestions, positive or negative. I will announce the rest of my plans later. I'm in a rush this afternoon!!
See ya.
Oh yeah, I designed and tested this with Firefox! If something looks wacky in IE, switch browsers and let me know, OK ?
Is there a new import coming soon? It will only take me an hour or so to import a new datamine into my database now that things seem to be going smoothly for me. And soon, if jhester wants, he can upload the file to my server himself, and it will be taken care of on its own.
I also jacked my sorted listing count up to 1000 at a time. It might take a little longer to load, but it is probably much more useful this way.
Cheers
#430
Senior Member
Join Date: Jun 2003
Posts: 270
Basaro, you da man!! I can't tell you how much the work that you and the others working to get CH titles obtained and compiled is appreciated! Keep up the awesome work!
Originally Posted by basaro
I updated my site with the last import from August. It still needs work, but it's coming along (the data is all accurate though). The next thing will be to fix a couple of the issues I mentioned before (which I haven't got around to yet), and to add a list of pre-order titles. I also have to work on incorporating the item number since it is now available in jhester's data, then I can create direct links to CH! I am also going to clean up the listing in the import update section, and obviously work on the ui. There are some accurate counts on the homepage now, but there are still a few more ways I want to organize the data (like regular price changes).
Is there a new import coming soon? It will only take me an hour or so to import a new datamine into my database now that things seem to be going smoothly for me. And soon, if jhester wants, he can upload the file to my server himself, and it will be taken care of on its own.
I also jacked my sorted listing count up to 1000 at a time. It might take a little longer to load, but it is probably much more useful this way.
Cheers
#432
DVD Talk Platinum Edition
Join Date: Jul 2003
Location: New Hampshire
Posts: 3,096
Fixed a couple things last night and added the pre-orders. Search is coming next. I've been working on something, but it's soooo slow right now. Needs some tweaking.
We need a new import though! Jhester? I would be happy to run the script every week, and keep the db updated, if you cannot. I am still feeling spoiled from when bga used to do this weekly.
#433
Member
Join Date: Mar 2002
Posts: 135
No I'm not dead yet (Despite appearances from my absence!) I find myself in school and covered up with work. So I apologize for not keeping my data more current. I can't make any promises, but I will try to work on getting an updated version. However, my process is not ideal, and I really hope that some other brave soul will step up with a process with better integrity than mine.
#435
Member
Join Date: Jan 2003
Location: Simpsonville SC
Posts: 185
I am messing around with a data miner for CH. I basically have it working but need to know the range of the Item numbers. I am pretty sure that all Item #s are between 1500000 and 1799999. Does anyone know if there is anything outside this range, or if it can be tightened more?
#436
DVD Talk Platinum Edition
Join Date: Jul 2003
Location: New Hampshire
Posts: 3,096
AlfB,
I saw your other thread and responded there. I was going to ask you to post additional comments here, but you beat me to it! So here is what I posted in the other thread; ignore that one and please just follow up here, thanks!
Cool. I haven't had time to write my own data miner. If you can generate this in a csv format similar to what jhester had done, I can import it into my db as well.
Based on the last import that I did from jhester, there were 10,527 total items: Starting with 1548333 "10" and ending with 1776444 "The Polar Express Gift Set". You might be able to tighten up your query a little more based on this, but I can't guarantee there aren't titles outside this range.
You're on the right track, thanks for your contribution!
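For a sense of how much tightening the range would save, here is the arithmetic on the numbers quoted above. It is only a sketch: it assumes item numbers are scanned one by one and that jhester's first and last items really do bound the catalog, which basaro says he can't guarantee.

```python
# Figures quoted in the thread.
BROAD_START, BROAD_END = 1500000, 1799999   # AlfB's proposed range
TIGHT_START, TIGHT_END = 1548333, 1776444   # first and last item in jhester's import
KNOWN_ITEMS = 10527                          # total items in that import

# Both ranges are inclusive, hence the +1.
broad_size = BROAD_END - BROAD_START + 1
tight_size = TIGHT_END - TIGHT_START + 1

saved = broad_size - tight_size
saved_fraction = saved / broad_size

# Share of item numbers in the tighter range that matched a real title.
hit_rate = KNOWN_ITEMS / tight_size

print(f"broad range: {broad_size} numbers")
print(f"tight range: {tight_size} numbers")
print(f"numbers skipped by tightening: {saved} ({saved_fraction:.1%})")
print(f"hit rate inside tight range: {hit_rate:.1%}")
```

So the tighter range skips roughly a quarter of the lookups, and even inside it only about one number in twenty is a real title.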
#437
Member
Join Date: Jan 2003
Location: Simpsonville SC
Posts: 185
I looked at the file and decided to start with 1500000 and go through 1799999. I have already done 1500000-1599999. I will be doing the 16s tonight and the 17s tomorrow night, since it takes about 8-9 hours for each one. One gotcha for the file is that I am not capturing pricing, as I am not currently a member. I am getting enrollment versus non-enrollment, which was my main goal here to begin with. Currently the data I am getting is Item #, Sel #, Enrollment/Not, Title, Rating, Num discs, Run Time, Studio, Rel Date and Format. When complete, I can provide it as a comma-delimited text file or an Excel spreadsheet. What would be the best way to get it to you?
Last edited by AlfB; 12-05-05 at 02:38 PM.
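The nightly batches described above can be sketched as below. This is not AlfB's actual code (the thread never shows it); `split_range` is a hypothetical helper, and the per-item timing is simply 8-9 hours divided by 100,000 item numbers.

```python
def split_range(start, end, chunk_size):
    """Yield inclusive (first, last) pairs that cover start..end."""
    first = start
    while first <= end:
        last = min(first + chunk_size - 1, end)
        yield first, last
        first = last + 1


# Three batches of 100,000 item numbers each, one per night.
chunks = list(split_range(1500000, 1799999, 100000))
for first, last in chunks:
    print(f"batch: {first}-{last}")

# Rough time per item number implied by "8-9 hours for each one".
seconds_per_item_low = 8 * 3600 / 100000
seconds_per_item_high = 9 * 3600 / 100000
print(f"about {seconds_per_item_low:.2f}-{seconds_per_item_high:.2f} s per item number")
```

Keeping the batches as plain (first, last) pairs makes it easy to rerun one night's range if something goes wrong partway through.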
#439
DVD Talk Platinum Edition
Join Date: Jul 2003
Location: New Hampshire
Posts: 3,096
Originally Posted by AlfB
I looked at the file and decided to start with 1500000 and go through 1799999. I have already done 1500000-1599999. I will be doing the 16s tonight and the 17s tomorrow night, since it takes about 8-9 hours for each one. One gotcha for the file is that I am not capturing pricing, as I am not currently a member. I am getting enrollment versus non-enrollment, which was my main goal here to begin with. Currently the data I am getting is Item #, Sel #, Enrollment/Not, Title, Rating, Num discs, Run Time, Studio, Rel Date and Format. When complete, I can provide it as a comma-delimited text file or an Excel spreadsheet. What would be the best way to get it to you?
Thanks for your hard work! It will be nice to have enrollment status again.
#440
Member
Join Date: Jan 2003
Location: Simpsonville SC
Posts: 185
Originally Posted by basaro
Please contact me through email via my profile here at dvdtalk. If this works out to be a good thing, I can give you access to my server in the future and let you add the new files at your leisure. Then everything will be updated automatically.
Thanks for your hard work! It will be nice to have enrollment status again.
By the way, are you anywhere near Bow? My sister lives there.
Last edited by AlfB; 12-05-05 at 09:12 PM.
#441
DVD Talk Platinum Edition
Join Date: Jul 2003
Location: New Hampshire
Posts: 3,096
Originally Posted by AlfB
No problem. Will contact you when I have the complete data set. As I type this, I am at 1632082 on the second set. The last set should be complete Wed AM. Most likely it will finish too late to send that morning so you should have it early Wed evening. Do you want comma delimited text or an excel file?
By the way, are you anywhere near Bow? My sister lives there.
I am about 45min from Bow. I used to live in Concord there myself for a while. I'd love to move back to that area if I can ever find a good job that isn't in Mass, but for now, I'm stuck near the border.
See my post above for a link to the website where I host this stuff if you're interested. If your data works out well, I'll start putting in the other features and start making some changes so it will be better.
Last edited by basaro; 12-06-05 at 07:03 AM.
#442
Member
Join Date: Jan 2003
Location: Simpsonville SC
Posts: 185
OK guys, bad news. I found a bug in the code and had to fix it. Unfortunately it gave an erroneous result in some cases for the enrollment/not field. I've fixed it and started the run again. The good news is that I had enough info to cull the run down to two nights. The bad news is that it will be another day before we have data, as I will run one tonight and the next tomorrow. basaro, I will send it as soon as I have it, most likely Thursday evening. Sorry for the delay guys.
#443
DVD Talk Platinum Edition
Join Date: Jul 2003
Location: New Hampshire
Posts: 3,096
Just an update for everyone:
AlfB has done a great job so far on his datamine. We're working out the kinks and caveats, and something will be imported into my database soon.
For the meantime I'll be putting the newest datamine csv file up on my site so you all can browse it manually for now. I'll post a link to it a little later this afternoon. I'm off to get supplies for football right now.
Big to AlfB!
#444
DVD Talk Platinum Edition
Join Date: Jul 2003
Location: New Hampshire
Posts: 3,096
I posted up AlfB's datamine file on my site. Lots of interesting data in it like future titles that aren't even available for pre-order yet! This could be a good reference for looking up Columbia Tri-Star and Universal titles which never show up until release date! I haven't verified that in the data yet, but it all seems possible.
Go to the news section I just added to my site. Again, nothing special, but it gets the job done for now.
Sorry for the delay on my end too. My server crashed on me the other day; I really, really need a new one now.
http://www.saladbarscam.com/web/ch.nsf
Thanks AlfB!
#445
DVD Talk Gold Edition
Looks pretty good. I'm not sure if this is just an Open Office issue, but some of the titles need fine tuning, as the separation of fields is getting confused by some extra commas (I think).
Perhaps there is an easy, and automated, way of removing all commas from any fields so as not to confuse their separation?
For examples (there are more though)..
They Shoot Movies Don't They? - the Making of Mirage
The Jeff Corwin Experience: Out on a Limb - Monkeys
Good Night and Good Luck
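Rather than stripping commas out of the titles, the usual fix is to have the miner wrap any field that contains a comma in double quotes, which spreadsheet importers generally treat as a text delimiter. A minimal sketch using Python's standard `csv` module follows; the item number is a placeholder, and the title is written with a comma after "Night" purely as an assumption about where the stray comma sits.

```python
import csv
import io


def naive_line(fields):
    """Join fields with commas and no quoting (the failure mode)."""
    return ",".join(fields) + "\n"


def quoted_line(fields):
    """Write fields with csv.writer, which quotes any field containing a comma."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerow(fields)
    return buf.getvalue()


# Placeholder item number; comma placement in the title is assumed.
fields = ["0000001", "Good Night, and Good Luck", "Widescreen"]

naive = naive_line(fields)
quoted = quoted_line(fields)

# Read both lines back the way an importer would.
naive_parsed = next(csv.reader(io.StringIO(naive)))
quoted_parsed = next(csv.reader(io.StringIO(quoted)))

print(len(naive_parsed), naive_parsed)    # title split across two columns
print(len(quoted_parsed), quoted_parsed)  # title kept in one column
```

This keeps the titles intact, which matters if the list is later used to search for a title by name.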
#446
DVD Talk Platinum Edition
Join Date: Jul 2003
Location: New Hampshire
Posts: 3,096
Originally Posted by abintra
Looks pretty good. I'm not sure if this is just an Open Office issue, but some of the titles need fine tuning, as the separation of fields is getting confused by some extra commas (I think).
Perhaps there is an easy, and automated, way of removing all commas from any fields so as not to confuse their separation?
For examples (there are more though)..
They Shoot Movies Don't They? - the Making of Mirage
The Jeff Corwin Experience: Out on a Limb - Monkeys
Good Night and Good Luck
If there is something other than the already known caveats, please let us know. I will have to make sure the data is coming in correctly before I can add it to my database and start generating stats, etc.
From my site:
Here is a list of the caveats right now:
1. The Enrollment column is the opposite of what it says. If Enrollment is specified, then it is actually Member Only. If Member is specified, then it is actually an Enrollment.
2. Some titles that contain commas get forced into the next column, which throws the rest of that row off.
3. Some titles show up which aren't even available for pre-order yet (future releases)! Likewise, there is no Pre-Order indicator at this time either.
4. There is no header for the list. Here is the current format:
* Item, Selection, Enrollment, Title, Rating, Discs, Time, Studio, Year, Format
#1, 2 & 4 should all be fixed for the next datamine. We're not sure how to fix #3 yet.
Thanks for the input
Last edited by basaro; 12-15-05 at 03:06 PM.
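Until the next datamine fixes them, caveats 1 and 4 in the list above can be worked around when reading the file. The sketch below is an illustration only: the column names come from the format line in the post, but the literal values "Enrollment" and "Member" are assumptions about what the file contains, and the sample rows are placeholders.

```python
import csv
import io

# Column order as given in the post (caveat 4: the file has no header row).
HEADER = ["Item", "Selection", "Enrollment", "Title", "Rating",
          "Discs", "Time", "Studio", "Year", "Format"]

# Assumed literal values; the thread does not show the exact strings used.
SWAP = {"Enrollment": "Member", "Member": "Enrollment"}


def fix_rows(text):
    """Parse headerless CSV text, attach HEADER, and flip the Enrollment column."""
    fixed = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) != len(HEADER):
            # Rows shifted by an unquoted comma (caveat 2) are skipped here.
            continue
        record = dict(zip(HEADER, row))
        # Caveat 1: the column says the opposite of what it means.
        record["Enrollment"] = SWAP.get(record["Enrollment"], record["Enrollment"])
        fixed.append(record)
    return fixed


# Placeholder rows, not real catalog data.
sample = (
    "0000001,0000011,Enrollment,Placeholder Title A,PG,1,100,Studio A,2005,Widescreen\n"
    "0000002,0000012,Member,Placeholder Title B,R,2,120,Studio B,2004,Fullscreen\n"
    "0000003,0000013,Member,Placeholder, Title C,R,1,90,Studio C,2003,Widescreen\n"
)

records = fix_rows(sample)
for record in records:
    print(record["Item"], record["Enrollment"], record["Title"])
```

Skipping the shifted rows is the cautious choice for a sketch; a real import would more likely log them for a manual fix.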
#448
DVD Talk Hall of Fame
Thank you basaro and AlfB! I was wondering if Columbia House was going to carry Serenity and it is on your list:
4315701 Serenity (Widescreen), list price $29.98, no sale price yet.
4316105 Serenity (Fullscreen, I think)
Since I passed on the recent BOGOF codes to save my fulfillments for Serenity, I am pleased that they are going to have it, to put it mildly.
#450
Member
Join Date: Jan 2003
Location: Simpsonville SC
Posts: 185
Just wanted to let everyone know the current status. I was in the middle of doing another download when the ice storm hit the Carolinas. I lost power and my internet connection. I have about half the new download done and now that I have an internet connection again, I hope to finish tonight.