Performance problems with large insertions

Comparing `AAAA...` with `A` is near-instant, but comparing `A` with `AAAA...` (a large insertion) takes over a second:

```
In [5]: %timeit extractor.describe_dna('A' * 10000, 'A')
10000 loops, best of 3: 129 µs per loop

In [6]: %timeit extractor.describe_dna('A', 'A' * 10000)
1 loops, best of 3: 1.13 s per loop
```
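To see how the runtime grows with the insertion size, a small timing helper along these lines could be used. This is only a sketch: `measure_scaling` is a hypothetical helper name, not part of the extractor, and the commented usage assumes `extractor.describe_dna` is importable as in the session above.

```python
import time


def measure_scaling(func, sizes, repeats=3):
    """Time func(size) for each size and keep the best of `repeats` runs.

    Hypothetical helper for illustration; not part of the extractor.
    """
    results = []
    for size in sizes:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            func(size)
            best = min(best, time.perf_counter() - start)
        results.append((size, best))
    return results


# Usage, assuming the extractor module from this report is installed:
#
#   import extractor
#   sizes = [1000, 2000, 4000, 8000]
#   for size, seconds in measure_scaling(
#           lambda n: extractor.describe_dna('A', 'A' * n), sizes):
#       print(size, seconds)
```

If doubling the size roughly quadruples the time, that would point to quadratic behaviour in the insertion case.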

Perhaps more importantly, memory usage also skyrockets. I couldn't run this test with a 50 kbp sample sequence on a machine with 4 GB of memory: it completely froze my machine for half a minute. I would like to prevent this from happening on the server.

I didn't look into this further, but I suspect it tries to find the inserted sequence in the original sequence, which of course is not possible. Could this be an easy case to optimize?
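To illustrate what I mean by an "easy case", here is a rough sketch of a pre-check that strips the common prefix and suffix and recognises a pure insertion or deletion in linear time. I don't know how the extractor works internally, so this is not its algorithm; `trim_common_affixes` and `classify_simple_case` are hypothetical names, and the sketch only classifies the change without producing a description.

```python
def trim_common_affixes(reference, sample):
    """Return (prefix_len, suffix_len) shared by both sequences.

    The suffix is not allowed to overlap the prefix in the shorter sequence.
    """
    limit = min(len(reference), len(sample))
    prefix = 0
    while prefix < limit and reference[prefix] == sample[prefix]:
        prefix += 1
    suffix = 0
    while (suffix < limit - prefix
           and reference[-1 - suffix] == sample[-1 - suffix]):
        suffix += 1
    return prefix, suffix


def classify_simple_case(reference, sample):
    """Classify the change as identical, insertion, deletion, or complex.

    Hypothetical pre-check: only the 'complex' case would need the
    full (expensive) comparison.
    """
    prefix, suffix = trim_common_affixes(reference, sample)
    ref_core = reference[prefix:len(reference) - suffix]
    sample_core = sample[prefix:len(sample) - suffix]
    if not ref_core and not sample_core:
        return ("identical", "")
    if not ref_core:
        return ("insertion", sample_core)
    if not sample_core:
        return ("deletion", ref_core)
    return ("complex", None)


# The slow case from this report is recognised without any searching.
kind, inserted = classify_simple_case('A', 'A' * 10000)
print(kind, len(inserted))
```

Both loops are bounded by the length of the shorter sequence, so time and memory stay linear in the input size for this case.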

