Efficient censorship
You are a low-level censor working for the Ministry of Media Accuracy. Part of your job is to make sure that certain words don't appear in publications. Every morning you get a fresh stack of next week's newspapers and its your job to comb through them and fix any of the words that were mistakenly included. A naive censor might simply remove the entire word completely, but you know that if you remove part of the word that still prevents the word from appearing in print. So for you it is a point of pride to remove as little of the offending word as possible while still ensuring it does not appear in print.
Task
Given strings $X$ and $Y$ output the longest substring of $X$ which does not contain $Y$ as a contiguous substring
This is code-golf so the goal is to minimize the size of your source code as measured in bytes.
Test cases
For each input output pair a possible correct output is given, your output should always be the same length, but it is not necessary to reproduce these outputs exactly.
chat, cat -> chat
escathe, cat -> esathe
homology, o -> hmlgy
ccatt, cat -> cctt
aababbabab, abab -> aaabbbab
ababab, abab -> abbab
aaaaaaaa, aaa -> aa
Japt `-h`, 7 bytes à ñÊ …
1y ago
Haskell + hgl, 14 bytes ``` …
1y ago
Vyxal, 44 bitsv2, 5.5 bytes …
1y ago
[Python 3], 102 bytes d …
1y ago
4 answers
Python 3, 102 bytes
def f(x,y):
a=[x]
for x in a:
if y not in x:return x
for i in range(len(x)+1):a+=[x[:i]+x[i+1:]]
Not exactly the most efficient, but it works. Just a BFS of all possible removals.
0 comment threads
Haskell + hgl, 14 bytes
xBl<<ss><fn<iw
Explanation
-
ss
gets all substrings of the input -
fn
filters out the substrings that don't ... -
iw
checks if the forbidden word is a contiguous substring -
xBl
gets the largest result
Reflection
There are a couple things that could be improved.
My first thought when writing this was: "I will list the substrings and just get the first one that satisfies the predicate", however there are two issues with this:
-
ss
does not output the substrings in a length based order. The order honestly seems sort of random. According to the docs "Items are ordered by the index of their final element with the empty list coming first." Which is true, however I fail to imagine a scenario in which that is the order you want to get the ouputs. I can imagine a scenario in which you want the longer substrings earlier (this exact challenge), so it should probably be changed. - The function
g1
which gives the first element satisfying a predicate, returns aMaybe'
. This makes sense, in general there might not be an element satisfying the predicate. However it is annoying. In this case I know that there always is at least one. There should be a version ofg1
which is unsafe in this regard. (There's also a typo ing1
s documentation)
Now moving onto my actual answer, I see a couple of functions I use which could be combined into reusable functions:
-
xBl
andfn
/fl
could be combined into a two functions "Longest such that" and "Longest such that not". - The above function could also be combined with
ss
to give two functions "Longest substring such that" and "Longest substring such that not", as these are both common enough set ups for code golf challenges, that they will likely come up again.
0 comment threads
Vyxal, 44 bitsv2, 5.5 bytes
ṗ'⁰c¬;ÞG
Or, try a test suite
This is like the second time this week I've used powerset for golf on a site that isn't SE.
Explained
ṗ'⁰c¬;ÞG
ṗ # Powerset of the first input
' ;ÞG # longest string s where:
c¬ # s does not contain
⁰ # the second input
1 comment thread