diff --git a/README.md b/README.md
index 89a2247..465c2d0 100644
--- a/README.md
+++ b/README.md
@@ -1,30 +1,26 @@
 # adguardhome-filters
 
 Hosts lists from Steven Black (https://github.com/StevenBlack/hosts)
- Cleaned-up from "localhost" records.
- 127.0.0.1 replaced to 0.0.0.0
- Extensions are left unmerged.
+ cleaned-up from "localhost" records;
+ 127.0.0.1 and 0.0.0.0 replaced to ||;
+ extensions are left unmerged;
+ left most top-domain only;
 
 Files are used for AdGuard Home DNS filtering.
 
-P.S. Looking for the way to translate easily multiple hostname records
+P.S. Looking for the intellectual algorithm to translate easily multiple hostname records
 to one line accordingly to general AdBlock rules set, i.e.
 
- www.abc.com
- abc.com
- external.www.abc.com.site
+ www1.abc.com
+ extra.abc.com
+ external.www.abc.com
+ pictures.domain.com
+ pictures1.domain.co.nz
+ pic-tures.domain.net.site
 
 to
 
  ||abc.com*^
+ ||domain.*^
 
 or similar.
-
-Need to build the following algorithm:
-1. grab original file;
-2. mirror each string and sort ( like 'cat ./file1 | rev | sort > file2' );
-3. moving down, remember each string and compare it with all the rest, deleting all longer ones ( "moc.cba||" -> delete all "moc.cba\.*" );
-4. revert strings back and sort;
-5. done.
-
-Better to be written in bash/sed/awk or python or Go.
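
Note on the algorithm removed by this diff: the five steps (grab the file, mirror and sort, drop the longer entries, revert, sort) can be sketched in Python roughly as below. This is only an illustration, not code from the repository. The function names `collapse_hosts` and `to_adblock_rules` are hypothetical, and the output uses the plain `||domain^` form rather than the `||abc.com*^` pattern shown in the README. One deliberate change from the README text: the sketch reverses the dot-separated labels instead of the characters, because in a character-wise sort `-` orders before `.`, so a host such as `abc-x.com` could land between `abc.com` and its subdomains and break a single-pass comparison.

```python
def collapse_hosts(hosts):
    """Keep only hostnames that have no parent domain in the same list."""
    # Reverse the labels ("www.abc.com" -> ("com", "abc", "www")) so that a
    # parent domain sorts immediately before all of its subdomains.
    keys = sorted(
        {tuple(reversed(h.strip().lower().split("."))) for h in hosts if h.strip()}
    )

    kept = []
    for key in keys:
        # A key that starts with the last kept key is a subdomain of it: skip.
        if kept and key[:len(kept[-1])] == kept[-1]:
            continue
        kept.append(key)

    # Restore the normal label order and sort the result.
    return sorted(".".join(reversed(key)) for key in kept)


def to_adblock_rules(hosts):
    """Format the collapsed hostnames as AdBlock-style rules (assumed form)."""
    return ["||" + host + "^" for host in collapse_hosts(hosts)]


if __name__ == "__main__":
    sample = ["www.abc.com", "abc.com", "external.www.abc.com.site"]
    for rule in to_adblock_rules(sample):
        print(rule)
```

With the three sample hosts from the old README, only `abc.com` and `external.www.abc.com.site` survive, since the latter is not a subdomain of `abc.com`. Merging across different top-level domains (the `||domain.*^` case added by this diff) is not covered by this sketch.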