The '-X' option of the wget command does not exclude all folders that start with "/~"

Solution Verified - Updated -

Issue

  • The goal is a simple one-liner that refreshes the website cache by pulling everything from the backend through the caching system. wget is used because it is already installed on the servers and is easy to use. The requirement is to exclude any folder that starts with "/~".

Started with this:

wget -4 -nd --no-cache --no-cookies --header="Host: www.example.com"\
 --max-redirect=0 -r -l 10 --delete-after -p --ignore-case -X '/~*'\
 -D www.example.com www.example.com

but paths starting with /~ were still being requested. Found that the wildcards in wget don't match directory separators, so added patterns with explicit directory separators and tried again:

# Each quoted piece of the -X list is closed before the line break, because a
# backslash inside single quotes is literal and does not continue the line.
wget -4 -nd --no-cache --no-cookies --header="Host: www.example.com" \
 --max-redirect=0 -r -l 10 --delete-after -p --ignore-case \
 -X '/~*,/~*/*,/~*/*/*,/~*/*/*/*,/~*/*/*/*/*,/~*/*/*/*/*/*,'\
'/~*/*/*/*/*/*/*,/~*/*/*/*/*/*/*/*,/~*/*/*/*/*/*/*/*/*,'\
'/~*/*/*/*/*/*/*/*/*/*,/~*/*/*/*/*/*/*/*/*/*/*' -D www.example.com www.example.com

but paths starting with /~ are still being requested.
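
The depth-by-depth exclude list is easy to mistype by hand, so it can be generated instead. The following is a minimal sketch in POSIX shell; the function name build_excludes and its depth argument are hypothetical and not part of wget. It only builds the same comma-separated string that is passed to -X above and does not change how wget matches the patterns.

```shell
#!/bin/sh
# Hypothetical helper (not a wget feature): print a comma-separated list
# made of "/~*" followed by one extra "/*" level per requested depth.
build_excludes() {
    be_depth=$1          # number of extra "/*" levels to append
    be_pattern='/~*'     # top-level pattern for folders starting with "/~"
    be_list=$be_pattern
    be_i=0
    while [ "$be_i" -lt "$be_depth" ]; do
        be_pattern="$be_pattern/*"
        be_list="$be_list,$be_pattern"
        be_i=$((be_i + 1))
    done
    # Quoted so the shell does not try to expand the wildcards itself.
    printf '%s\n' "$be_list"
}

# Example: two extra levels below "/~*".
build_excludes 2
```

With a depth of 10 the function produces the same eleven patterns as the command above, so the option could be written as -X "$(build_excludes 10)". The double quotes matter: without them the shell would try to expand the wildcards against the local filesystem before wget sees them.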

Environment

  • Red Hat Enterprise Linux (RHEL) 6
  • Red Hat Enterprise Linux (RHEL) 5
