The '-X' option of the wget command does not exclude all folders that start with "/~"

Solution Verified - Updated -

Issue

  • The goal is a simple one-liner that refreshes the website cache by pulling everything from the backend through the caching system. wget is used because it is already installed on the servers and is easy to use. The requirement is to exclude any folder that starts with "/~".

The first attempt was:

wget -4 -nd --no-cache --no-cookies --header="Host: www.example.com"\
 --max-redirect=0 -r -l 10 --delete-after -p --ignore-case -X '/~*'\
 -D www.example.com www.example.com

This still produced hits on paths starting with /~. Because the wildcards in wget don't match directory separators, the exclude list was extended with one pattern per directory depth:

wget -4 -nd --no-cache --no-cookies --header="Host: www.example.com"\
 --max-redirect=0 -r -l 10 --delete-after -p --ignore-case\
 -X '/~*,/~*/*,/~*/*/*,/~*/*/*/*,/~*/*/*/*/*,/~*/*/*/*/*/*,'\
'/~*/*/*/*/*/*/*,/~*/*/*/*/*/*/*/*,/~*/*/*/*/*/*/*/*/*,'\
'/~*/*/*/*/*/*/*/*/*/*,/~*/*/*/*/*/*/*/*/*/*/*' -D www.example.com www.example.com

This also still produced hits on paths under /~.
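
The depth-by-depth exclude list rests on the observation that a wildcard does not match a directory separator. The following bash sketch illustrates that matching rule and shows one way to generate such a list instead of typing it by hand. It is only an approximation of path-aware globbing, not wget's own code: `path_match` and `exclude_list` are hypothetical helper names introduced here, and the example paths (`/~alice/...`) are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: emulates a path-aware glob in which '*' never crosses '/'.
# path_match and exclude_list are hypothetical helpers, not part of wget.

# Succeeds when PATTERN and PATH have the same number of '/'-separated
# components and every component matches the corresponding glob.
path_match() {
    local pattern=${1#/} path=${2#/}
    local -a pats parts
    local i
    IFS=/ read -r -a pats <<< "$pattern"
    IFS=/ read -r -a parts <<< "$path"
    [ "${#pats[@]}" -eq "${#parts[@]}" ] || return 1
    for i in "${!pats[@]}"; do
        # The unquoted expansion is deliberately used as a glob pattern.
        case "${parts[i]}" in
            ${pats[i]}) ;;
            *) return 1 ;;
        esac
    done
    return 0
}

# Prints a comma-separated list /~*,/~*/*,... covering 1..DEPTH components.
exclude_list() {
    local depth=$1 pattern='/~*' list='/~*'
    local i
    for ((i = 1; i < depth; i++)); do
        pattern+='/*'
        list+=",$pattern"
    done
    printf '%s\n' "$list"
}

path_match '/~*'   '/~alice'             && echo "/~* matches /~alice"
path_match '/~*'   '/~alice/public_html' || echo "/~* does not match /~alice/public_html"
path_match '/~*/*' '/~alice/public_html' && echo "/~*/* matches /~alice/public_html"
exclude_list 3
```

Under these assumptions, `exclude_list 11` produces the same eleven patterns used in the second command, so it could be passed as `-X "$(exclude_list 11)"`. The sketch only demonstrates why one pattern per depth is needed; it does not explain why paths under /~ are still fetched.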

Environment

  • Red Hat Enterprise Linux (RHEL) 6
  • Red Hat Enterprise Linux (RHEL) 5
