renting real estate scrappers by jdesi22 · Pull Request #52 · formalsec/webcap

jdesi22 · 2026-03-01T13:23:47Z

Adapted the timeout.
Added real estate scrapers for renting instead of buying.

frediramos

Some scrapers are not working well. More details below.

frediramos · 2026-03-04T21:03:36Z

src/webscraper/websites/renting/casasapo.py

This scraper did not seem to work well for me.
I got this when I ran it:

> python3 -m webscraper run casasapo-rent
> Info: Running webscraper
> Info: mode: run
> Info: website: casasapo-rent
> Info: database: /home/ramos/webcap/src/databases/webscraper.db
> Info: Entering Page: https://www.casa.sapo.pt/alugar-apartamentos/distrito.Lisboa/
> Warning: Could not extract any listing url from the following page: 'https://www.casa.sapo.pt/alugar-apartamentos/distrito.Lisboa/'
> Info: Data saved to data.json

jdesi22 · Mar 5, 2026

I looked into it, and it seems to be a problem with the site loading, not with the scraper's execution. If you run the buy version of casasapo, the same problem occurs.
I think the fix is just to let the site finish loading?
I'm not sure whether it is feasible to force a wait until the site loads in these scrapers?

frediramos · Mar 5, 2026

> I'm not sure whether it is feasible to force a wait until the site loads in these scrapers?

Sounds like a good idea to me.
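The forced wait discussed above could be sketched as a small polling helper. This is only an illustration under stated assumptions: `wait_until`, its `probe` argument, and the default timeout and interval are hypothetical names and values, not part of webcap's actual API. The idea is to retry the listing extraction until it yields something or a deadline passes, instead of reading the page once.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_until(
    probe: Callable[[], Optional[T]],
    timeout: float = 10.0,
    interval: float = 0.5,
) -> Optional[T]:
    """Call `probe` until it returns a truthy value or `timeout` seconds pass.

    Hypothetical helper: in a scraper, `probe` would be whatever call
    extracts the listing URLs from the currently loaded page.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = probe()
        if result:
            return result
        if time.monotonic() >= deadline:
            # The page never produced content; the caller can log the warning.
            return None
        time.sleep(interval)
```

A scraper could pass its own extraction call as `probe` and only emit the "Could not extract any listing url" warning when `wait_until` returns `None`.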

frediramos · 2026-03-04T21:09:38Z

src/webscraper/websites/renting/imovirtual.py

The imovirtual scraper initially had a problem, which I fixed while taking the opportunity to add a feature.
In any case, you can see the fix here and give it a look.

(The fallback was getting a None in the location.)

frediramos · 2026-03-04T21:17:16Z

src/webscraper/websites/renting/properstar.py

+    def extract_bathrooms(self, webcap: WebCap, **kwargs) -> int | None:
+        extracted = webcap.fetch("#X6", XPATHS["bathrooms"])
+        if extracted:
+            return self._clean_bathrooms(extracted)


A list with the value [1, 1] is reaching _clean_bathrooms instead of a string.
If you can fix that, even better. If not, you can use the new @cleaner decorator:

@cleaner
def _clean_bathrooms(self, desc: str, warning=True):

The decorator makes the method return None and logs a warning message.
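To make the described behavior concrete, here is a minimal sketch of how such a decorator might work. It is a hypothetical stand-in, not the project's actual `@cleaner`: the logger name, the set of caught exceptions, the `Demo` class, and the way the `warning` flag is handled are all assumptions made for illustration.

```python
import functools
import logging
from typing import Any, Callable, Optional

logger = logging.getLogger("webscraper")  # assumed logger name


def cleaner(func: Callable[..., Any]) -> Callable[..., Optional[Any]]:
    """Hypothetical stand-in for the @cleaner decorator described above.

    If the wrapped cleaning method raises on unexpected input,
    log a warning (unless warning=False) and return None.
    """
    @functools.wraps(func)
    def wrapper(self, value, *args, warning: bool = True, **kwargs):
        try:
            return func(self, value, *args, **kwargs)
        except (TypeError, ValueError, AttributeError, IndexError) as exc:
            if warning:
                logger.warning("%s could not clean %r: %s", func.__name__, value, exc)
            return None
    return wrapper


class Demo:
    """Illustrative class; the real scrapers are not shown in this thread."""

    @cleaner
    def _clean_bathrooms(self, desc: str) -> int:
        # Expects a string such as "2 bathrooms"; a list has no .split().
        return int(desc.split()[0])
```

With this sketch, `Demo()._clean_bathrooms("2 bathrooms")` returns `2`, while passing the list `[1, 1]` returns `None` and logs a warning rather than raising.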

frediramos · 2026-03-04T21:17:33Z

src/webscraper/websites/renting/supercasa.py

+}
+
+
+class Supercasa(RealEstate):


This one seemed to work well :)

frediramos · 2026-03-04T21:19:48Z

PS: Since these scrapers 'inherit' from the original real estate ones, it may make sense to check whether they have the same errors.

jdesi22 requested a review from frediramos March 1, 2026 13:24

frediramos requested changes Mar 4, 2026

frediramos force-pushed the instance-support branch from 7917a23 to d2cd979 Compare March 12, 2026 17:19

jdesi22 and others added 11 commits March 16, 2026 10:42

f035ff6 increase default timeout for page operations to improve reliability
4ef9a81 feat: add rental support for multiple websites
0405418 Improve WebsiteOpt Enum (Just for clarity, since we have quite a few now)
7a33de3 Fixup
438a491 Remove homepage reference
cc8b237 Fix function name (dict_flatten -> list_flatten)
b8e9e42 Allow passing extra args besides subtree (To build_dataclass)
f7594c8 Fix imovirtual-rent scraper
31cf41a Fix imovirtual scraper
9ffbdf6 Use new xpath IDs
32ea738 fix: solve properstar issues

frediramos force-pushed the instance-support branch from ffb5f57 to 32ea738 Compare March 16, 2026 10:50

jdesi22 wants to merge 11 commits into main from instance-support
