Blog

New Features: Crawling Frames/IFrames and Non UTF-8 encoding

Published: 2017-02-14 09:38:08 UTC
Updated: 2017-02-14 09:38:08 UTC

Hi there,

These days, most web developers understand the importance of UTF-8 encoding, which is widely used in the Western world. However, some websites in East Asian languages, such as Chinese, Korean, or Japanese, still stick to their native encodings. Curlpic wasn’t usable on these sites, so we’ve built a workaround for users.

To check whether a site uses an encoding other than UTF-8, just view the page source and search for ‘utf-8’. If you don’t find it and the declared encoding is something else, as in the example below:

<meta http-equiv="Content-Type" content="text/html; charset=MS949">

Then tick the Non UTF-8 encoding option.
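If you’d rather not eyeball the page source, the same check can be scripted. The sketch below is only an illustration, not how curlpic detects encodings internally: the function names `declared_charset` and `is_non_utf8` are made up for this example, and it assumes the encoding is declared in a `<meta>` tag near the top of the page.

```python
import re

# Matches both <meta charset="..."> and the older
# <meta http-equiv="Content-Type" content="text/html; charset=...">.
CHARSET_RE = re.compile(
    rb'<meta[^>]*?charset\s*=\s*["\']?\s*([\w.:-]+)', re.IGNORECASE
)


def declared_charset(html_bytes, default="utf-8"):
    """Return the charset named in the first matching <meta> tag, lowercased."""
    # Assumption: the declaration sits within the first 4 KB of the page.
    match = CHARSET_RE.search(html_bytes[:4096])
    return match.group(1).decode("ascii").lower() if match else default


def is_non_utf8(html_bytes):
    """True when the page declares an encoding other than UTF-8."""
    return declared_charset(html_bytes) not in ("utf-8", "utf8")


page = b'<meta http-equiv="Content-Type" content="text/html; charset=MS949">'
print(declared_charset(page), is_non_utf8(page))
```

A page with no declaration at all falls back to the default here, so it would be treated as UTF-8.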

The second improvement is crawling frames and iframes. This uses more resources, because every frame in a frameset, and every iframe, has to be crawled as well. We’ve added it as an option for you to choose, as we don’t want frame/iframe crawling to be the default for all users.
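To give a rough idea of why frames cost more, here is a minimal sketch of collecting the extra pages a frame-aware crawl has to visit. It is an illustration under our own assumptions, not curlpic’s actual crawler: `FrameCollector`, `frame_urls`, and the `crawl_frames` flag are hypothetical names.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FrameCollector(HTMLParser):
    """Collects the src of every <frame> and <iframe> tag."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.frame_urls = []

    def handle_starttag(self, tag, attrs):
        if tag in ("frame", "iframe"):
            src = dict(attrs).get("src")
            if src:
                # Resolve relative paths against the page that holds the frame.
                self.frame_urls.append(urljoin(self.base_url, src))


def frame_urls(html, base_url, crawl_frames=False):
    """Return frame/iframe URLs to crawl, or nothing when the option is off."""
    if not crawl_frames:
        return []
    collector = FrameCollector(base_url)
    collector.feed(html)
    return collector.frame_urls
```

Every URL returned is one more page to fetch and scan, which is why the option is off by default in this sketch too.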

That’s all for now. Happy crawling.

 

Crawling Background Images

Published: 2016-12-18 09:21:49 UTC
Updated: 2016-12-18 09:36:57 UTC

Hi there,

Websites these days are getting more dynamic, with various ways to place pictures, including div tags with background images.

So we’ve added support for crawling divs that have background images embedded in them. For example, Google Photos does this in the HTML below:

<div class="RY3tic" data-latest-bg="https://lh3.googleusercontent.com/aW-SN2JzonOY9dnA8qAw5jNHiqkPFjX4g1PkVSLHxGpW45hiRHU7CpbOLwgRfTspukvhDIxk2ohrqXKC8456rl8YQ9Pph97eTrN7y7GbuQ1okWM02b-sYlsBXnVAgURARh_cpw=w564-h317-no" style="opacity: 1; background-image: url(&quot;https://lh3.googleusercontent.com/aW-SN2JzonOY9dnA8qAw5jNHiqkPFjX4g1PkVSLHxGpW45hiRHU7CpbOLwgRfTspukvhDIxk2ohrqXKC8456rl8YQ9Pph97eTrN7y7GbuQ1okWM02b-sYlsBXnVAgURARh_cpw=w564-h317-no&quot;), url(&quot;https://lh3.googleusercontent.com/aW-SN2JzonOY9dnA8qAw5jNHiqkPFjX4g1PkVSLHxGpW45hiRHU7CpbOLwgRfTspukvhDIxk2ohrqXKC8456rl8YQ9Pph97eTrN7y7GbuQ1okWM02b-sYlsBXnVAgURARh_cpw=w619-h348-no&quot;);"></div>
This works for both direct and thumbnail crawling.
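For the curious, pulling image links out of inline styles like the one above can be sketched roughly as follows. This is a simplified illustration rather than curlpic’s real code: `background_image_urls` is a hypothetical helper, and it assumes the style attribute is double-quoted and the URLs contain no closing parenthesis.

```python
import html
import re

STYLE_RE = re.compile(r'style\s*=\s*"([^"]*)"', re.IGNORECASE)
URL_RE = re.compile(r"url\(\s*['\"]?(.*?)['\"]?\s*\)", re.IGNORECASE)


def background_image_urls(markup):
    """Return the url(...) targets found in inline background styles."""
    urls = []
    for style in STYLE_RE.findall(markup):
        style = html.unescape(style)  # turns &quot; back into a plain quote
        if "background" not in style.lower():
            continue
        for url in URL_RE.findall(style):
            if url not in urls:  # the same image is often listed twice
                urls.append(url)
    return urls
```

Note that a single div can list several URLs, as the Google Photos example does with two sizes of the same picture.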

Try it out, and do send us any feedback at support@curlpic.com.

[Image: style-background-images]

Updates: Too many 1px results…

Published: 2016-12-10 04:29:25 UTC
Updated: 2016-12-10 04:31:40 UTC

Hi guys,

Since we added the ability to crawl GIF and PNG images, it seems that most of the ones found aren’t meant to be downloaded. So we’ve added a minimum dimension filter of 10px (for both width and height) for potential pictures. This should eliminate the tiny GIFs that web designers mostly use for padding.
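In code terms, the filter boils down to something like the sketch below. The names are hypothetical, and we’re assuming here that an image exactly 10px wide and tall still passes.

```python
MIN_DIMENSION = 10  # pixels, the threshold mentioned in this post


def is_potential_picture(width, height, minimum=MIN_DIMENSION):
    """Keep an image only if both its width and its height reach the minimum."""
    return width >= minimum and height >= minimum


def filter_candidates(images):
    """images: iterable of (url, width, height) tuples; returns the kept URLs."""
    return [url for url, width, height in images if is_potential_picture(width, height)]


candidates = [
    ("spacer.gif", 1, 1),      # classic 1px padding image
    ("rule.png", 600, 2),      # wide but only 2px tall
    ("photo.png", 640, 480),
]
print(filter_candidates(candidates))
```

Because both dimensions are checked, long thin divider images are dropped along with the 1px spacers.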

Another thing we’ve worked on is making the “Stop” download button more responsive. More checks are now made so that it stops faster. It’s still not immediate, but at least it won’t take forever.
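The general idea behind “more checks” can be sketched like this. It is a generic pattern under our own assumptions, not the actual curlpic code: `download_all` and the `fetch` callback are hypothetical.

```python
import threading


def download_all(urls, fetch, stop_event):
    """Fetch urls in order, checking stop_event before each download."""
    done = []
    for url in urls:
        # The more often this flag is checked, the sooner a Stop click takes effect.
        if stop_event.is_set():
            break
        done.append(fetch(url))
    return done


stop = threading.Event()
# Elsewhere, the Stop button handler would simply call stop.set().
```

A download that is already in progress still has to finish (or time out) before the next check, which is why stopping is faster but not instant.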

Do drop us any feedback at support@curlpic.com or on our Facebook page: https://www.facebook.com/curlpic

 

 

Updates: Improving crawl results

Published: 2016-12-03 08:25:15 UTC
Updated: 2016-12-03 08:25:15 UTC

Hi there,

At curlpic, we’re always trying to give users better image results. From the data we gathered, we found two problems:

  1. Too many similar links.
  2. Certain sites load too slowly, so their images fail to load.

We’ve made some updates to fix both, which also overcomes issues with certain websites that weren’t returning the results users wanted.

Thanks for trying us out.

 

Google Images Search Crawl

Published: 2016-11-15 06:00:45 UTC
Updated: 2016-11-15 06:00:45 UTC

Hi there,

We’ve added a cool feature recently. You can now put in the link for a Google Images search. For example, if you search for ‘cute cat’ in Google, the link will look like this:

https://www.google.com/search?q=cute+cat&espv=2&biw=1490&bih=733&source=lnms&tbm=isch&sa=X&ved=0ahUKEwiA-qqtiKrQAhUYTI8KHb7nCj4Q_AUICCgB&dpr=0.9

Then paste the URL into curlpic.com, and make sure “Search thumbnails” is ticked. This will crawl the actual original images indexed by Google.
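If you’re not sure whether a link is a Google Images results link, the sketch below shows one way to check. It assumes, based on the example URL above, that the `tbm=isch` parameter marks an image search; `google_image_query` is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlsplit


def google_image_query(url):
    """Return the search phrase if url looks like a Google Images link, else None."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    # Assumption: tbm=isch is what distinguishes an image search from a web search.
    if "google." not in parts.netloc or params.get("tbm") != ["isch"]:
        return None
    return params.get("q", [None])[0]


link = (
    "https://www.google.com/search?q=cute+cat&espv=2&biw=1490&bih=733"
    "&source=lnms&tbm=isch&sa=X"
    "&ved=0ahUKEwiA-qqtiKrQAhUYTI8KHb7nCj4Q_AUICCgB&dpr=0.9"
)
print(google_image_query(link))
```

A plain web search link for the same phrase has no `tbm=isch` parameter, so the helper returns nothing for it.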

Do let us know if this works for you 🙂

[Image: curlpic-screen-google-images-crawl]

 

 

Now How Do I Start Using This?

Published: 2016-02-26 14:22:03 UTC
Updated: 2016-02-26 14:37:21 UTC

Ok, you’ve stumbled upon curlpic. Now how the heck would this tool be useful to you?

Imagine you have this link: http://www.primeportal.net/tanks/carrey/t-10/

[Image: primeportal-t-10-tank-walkaround-curlpic]

It’s a walkaround gallery of the famous T-10 Russian tank.

Now, it’s basically a simple gallery of thumbnails, each of which leads to the actual image that you want. So you might be thinking, “Well, I could click them one by one, right-click on the actual image, then download it”. That’s not so bad if you only want a few.

Or, a better idea: use curlpic to download all of them for you.

[Image: curlpic-screenshot-tutorial]

Just enter the link in the URL field, then tick “Search thumbnails”. It’ll then proceed to download those images in one shot.
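To show roughly what a thumbnail search involves, here is a minimal sketch. It is not curlpic’s implementation: `ThumbnailLinkCollector` and `full_size_image_urls` are hypothetical names, and it assumes each thumbnail is an `<img>` wrapped in a link that points straight at the full-size image file. Galleries that link to an intermediate page would need an extra step.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif")


class ThumbnailLinkCollector(HTMLParser):
    """Collects hrefs of <a> tags that wrap an <img> and point at an image file."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.current_href = None  # href of the <a> we are currently inside
        self.full_size_urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            self.current_href = attrs.get("href")
        elif tag == "img" and self.current_href:
            if self.current_href.lower().endswith(IMAGE_EXTENSIONS):
                url = urljoin(self.base_url, self.current_href)
                if url not in self.full_size_urls:
                    self.full_size_urls.append(url)

    def handle_endtag(self, tag):
        if tag == "a":
            self.current_href = None


def full_size_image_urls(html, base_url):
    collector = ThumbnailLinkCollector(base_url)
    collector.feed(html)
    return collector.full_size_urls
```

Images that aren’t wrapped in a link, such as a site logo, are ignored, and so are plain text links.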

Now, that only downloads all the images from a single page. What if you want to grab the rest of the pictures, from page 2 onwards? That’s where our Premium tools come in handy. They’ll grab the pictures from the current page, then move on to the next one.

If you want to start the download from, say, page 5, just enter 5, and voila, it’ll start the downloads from page 5.
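Conceptually, starting from a given page looks something like the sketch below. The URL template with a `{page}` placeholder is purely hypothetical, since every site numbers its pages differently, and `page_urls` is a made-up name for this example.

```python
from itertools import count, islice


def page_urls(template, start_page=1):
    """Yield gallery page URLs from start_page onwards.

    template is a hypothetical URL pattern containing "{page}".
    """
    for page in count(start_page):
        yield template.format(page=page)


# Take the first three pages, starting from page 5.
first_three = list(islice(page_urls("http://example.com/gallery?page={page}", start_page=5), 3))
print(first_three)
```

The generator never ends on its own, so a real crawl would stop when a page has no more pictures or when you press Stop.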

If you get tired of waiting, just press the Stop button. It’ll zip up the pictures downloaded so far, and you can then download the zip file via a link.

We’ll keep “Largest Image Possible” for our next post.
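Zipping up whatever has been downloaded so far can be sketched with Python’s standard library as below. This only illustrates the idea; `zip_downloaded` is a hypothetical helper, and it assumes the pictures are held in memory as a mapping of file name to bytes.

```python
import io
import zipfile


def zip_downloaded(files):
    """files: dict mapping file name to the image bytes downloaded so far."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return buffer.getvalue()  # bytes of the finished zip, ready to be served


archive_bytes = zip_downloaded({"t10_01.jpg": b"first image", "t10_02.jpg": b"second image"})
```

Since the archive is built only from finished downloads, stopping early never leaves a half-written picture inside the zip.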

Hope this helps you on your first curlpic download.