Memory leak each loop at lwIP: 2 LM & HB (not at 1.4 HB) #7059

Closed
@civilman2006

Description

@civilman2006
  • Hardware: ESP8266EX

  • Core Version: SDK:2.2.2-dev(38a443e)/Core:2.6.3=20603000/lwIP:STABLE-2_1_2_RELEASE/glue:1.2-16-ge23a07e/BearSSL:89454af

  • Development Env: Arduino IDE

  • Operating System: Windows

  • Module: LOLIN Wemos D1 mini Pro & Wemos D1 r2 mini

  • Flash Size: 16MB

  • lwip Variant: v2 Lower Memory and Higher Bandwidth

  • Flash Frequency: 40Mhz

  • CPU Frequency: 80Mhz

  • Upload Using: SERIAL

  • Upload Speed: 460800

With lwIP v2 Lower Memory and v2 Higher Bandwidth I see a memory leak on each LOOP of 32 bytes or more. I tried many options, with and without debug, and so on. After ~26 minutes of running, the ESP hits :oom and reboots with a dump. Sometimes one, two or three loops pass without a leak, but then the leak continues.

If I switch to the lwIP variant 1.4 Higher Bandwidth, the memory leak stops and everything works fine!
(SDK:2.2.2-dev(38a443e)/Core:2.6.3=20603000/lwIP:1.4.0rc2/BearSSL:89454af - that variant works fine.) I can provide its debug log, but it would be the same as below, except for the memory leak.

#include <PubSubClient.h>
#include <ESP8266WiFi.h>

#define ESP8266

#define DEBUG 1

const char* ssid = "zzzzzz";
const char* password = "xxxxxxx";
const char* mqtt_server = "111.222.222.222";
const char* mqtt_user = "zzzz";
const char* mqtt_pass = "zzzz";
const char* mqtt_clientId = "gost-temp";
const char* mqtt_ping_topic = "apr/home/ping";
const char* mqtt_online_message = "online";
const char* mqtt_last_will = "offline";
const char* mqtt_topic_base = "apr/home/";
const int ping_time = 30;
bool firstRun = true;

WiFiClient espClient;
PubSubClient client(espClient);

uint32_t originalram;

unsigned long lastMeasure = 0;

void setup_wifi() {
  int zz = 0;
  delay(10);
#ifdef DEBUG
  Serial.println();
  Serial.print("Connecting to ");
  Serial.println(ssid);
#endif
  WiFi.mode(WIFI_STA);
  WiFi.begin(ssid, password);
  delay(500);
  while (WiFi.status() != WL_CONNECTED) {
    delay(100);
#ifdef DEBUG
    Serial.print(".");
#endif
    zz=zz+1;
    if (zz >= 100) {
      ESP.restart();
    }
    Serial.print(zz);
  }
#ifdef DEBUG
  Serial.println("");
  Serial.print("WiFi connected - ESP IP address: ");
  Serial.println(WiFi.localIP());
#endif
  delay(1000);
}

void reconnect() {
  delay(500);
  while (!client.connected()) {
#ifdef DEBUG
    Serial.print("Attempting MQTT connection...");
#endif
    String mqtt_clientIdRand = mqtt_clientId;
    mqtt_clientIdRand += String(random(0xffff), HEX);
    if (client.connect(String(mqtt_clientIdRand).c_str(), mqtt_user, mqtt_pass, (String(mqtt_ping_topic)).c_str(), 0, 1, mqtt_last_will)) {
#ifdef DEBUG
      Serial.println("connected");
#endif
    } else {
#ifdef DEBUG
      Serial.print("failed, rc=");
      Serial.print(client.state());
      Serial.println(" try again in 5 seconds");
#endif
      delay(5000);
    }
  }
}

void setup() {
  randomSeed(millis());
  Serial.begin(115200);
  setup_wifi();

  client.setServer(mqtt_server, 1883);

  if (!client.connected()) {
    reconnect();
  }
  client.loop();
  client.publish(mqtt_ping_topic, mqtt_online_message);
  client.loop();

  originalram = ESP.getFreeHeap();
}

void loop() {
  if ((millis() - lastMeasure) > (ping_time * 1000)) {
    lastMeasure = millis();

    client.loop();
    client.publish(mqtt_ping_topic, mqtt_online_message);
    client.loop();

    uint32_t ram = ESP.getFreeHeap();
    Serial.printf("RAM: %u  change %d\n", (unsigned)ram, (int)(ram - originalram));
  }
  delay(30);
}

Debug log:

07:49:35.414 -> SDK:2.2.2-dev(38a443e)/Core:2.6.3=20603000/lwIP:STABLE-2_1_2_RELEASE/glue:1.2-16-ge23a07e/BearSSL:89454af
07:49:35.414 -> 
07:49:35.414 -> Connecting to XXXXX
07:49:35.414 -> bcn 0
07:49:35.414 -> del if1
07:49:35.414 -> usl
07:49:35.414 -> mode : sta(84:f3:eb:db:5a:3c)
07:49:35.414 -> add if0
07:49:35.586 -> wifi evt: 8
07:49:36.171 -> .1.2wifi evt: 2
07:49:36.378 -> .3.4.5.6.7.8.9.10.11.12.13.14.15.16.17.18.19.20.21.22.23scandone
07:49:39.389 -> state: 0 -> 2 (b0)
07:49:39.389 -> .24state: 2 -> 3 (0)
07:49:39.389 -> state: 3 -> 5 (10)
07:49:39.389 -> add 0
07:49:39.389 -> aid 1
07:49:39.389 -> cnt 
07:49:39.492 -> .25
07:49:39.492 -> connected with XXXXXX, channel 6
07:49:39.561 -> dhcp client start...
07:49:39.561 -> wifi evt: 0
07:49:39.561 -> .26ip:192.168.88.122,mask:255.255.255.0,gw:192.168.88.1
07:49:39.596 -> wifi evt: 3
07:49:39.666 -> .27
07:49:39.666 -> WiFi connected - ESP IP address: 192.168.88.122
07:49:41.182 -> Attempting MQTT connection...[hostByName] Host: 111.111.111.111 is a IP!
07:49:41.182 -> :ref 1
07:49:41.216 -> :wr 70 0
07:49:41.216 -> :wrc 70 70 0
07:49:41.216 -> :ack 70
07:49:41.216 -> :rn 4
07:49:41.216 -> :c0 1, 4
07:49:41.216 -> connected
07:49:41.216 -> :wr 23 0
07:49:41.216 -> :wrc 23 23 0
07:49:41.285 -> :ack 23
07:49:49.372 -> pm open,type:2 0
07:50:03.705 -> :rcl
07:50:03.705 -> :abort
07:50:05.117 -> RAM: 50008  change 1296
07:50:35.151 -> RAM: 49976  change 1264
07:51:05.188 -> RAM: 49944  change 1232
07:51:35.216 -> RAM: 49944  change 1232
07:52:05.217 -> RAM: 49944  change 1232
07:52:35.249 -> RAM: 49880  change 1168
07:53:05.287 -> RAM: 49816  change 1104
07:53:35.307 -> RAM: 49784  change 1072
07:54:05.336 -> RAM: 49784  change 1072
07:54:35.374 -> RAM: 49688  change 976
07:55:05.382 -> RAM: 49560  change 848
07:55:35.429 -> RAM: 49464  change 752
07:56:05.438 -> RAM: 49464  change 752
07:56:35.483 -> RAM: 49336  change 624
07:57:05.512 -> RAM: 49080  change 368
07:57:35.547 -> RAM: 48856  change 144
07:58:05.566 -> RAM: 48648  change -64
07:58:35.574 -> RAM: 48488  change -224
07:59:05.604 -> RAM: 48008  change -704
07:59:35.638 -> RAM: 47528  change -1184
08:00:05.669 -> RAM: 47368  change -1344
08:00:35.683 -> RAM: 47176  change -1536
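
To make the leak rate in a log like the one above easier to read, the change per interval can be computed from consecutive readings. This is only a host-side sketch in plain C++, not part of the sketch above; `intervalDeltas` is a hypothetical helper name, and the sample values mentioned afterwards are the first four `RAM:` readings from the log.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: signed change in free heap between consecutive
// readings. Negative values mean heap was lost during that interval.
std::vector<int32_t> intervalDeltas(const std::vector<uint32_t>& freeHeap) {
    std::vector<int32_t> deltas;
    for (std::size_t i = 1; i < freeHeap.size(); ++i) {
        // Widen before subtracting so a shrinking heap gives a negative
        // number instead of wrapping around as unsigned arithmetic would.
        const int64_t diff = static_cast<int64_t>(freeHeap[i]) -
                             static_cast<int64_t>(freeHeap[i - 1]);
        assert(diff >= INT32_MIN && diff <= INT32_MAX);
        deltas.push_back(static_cast<int32_t>(diff));
    }
    return deltas;
}
```

For the readings 50008, 49976, 49944, 49944 this gives -32, -32, 0: two intervals that each lose 32 bytes, followed by one without a loss, which matches the description.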

Activity

self-assigned this
on Feb 3, 2020
Jeroen88

Jeroen88 commented on Feb 3, 2020

@Jeroen88
Contributor

I suspect lwIP in a similar situation on the ESP32, see here. In that case (using a WiFiClientSecure) I suspect ssl_client->socket = lwip_socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) in ssl_client, because if I add lwip_close(ssl_client->socket) directly after this line, memory has still leaked. These are lwIP calls, so they have nothing to do with mbedtls / SSL and should be comparable to your situation. I am not sure, though: I do not know whether lwip_socket() followed by lwip_close() should release all memory, because I am not familiar enough with lwIP, but that seems logical to me. I am also not sure whether the ESP32 uses the same lwIP version.
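
One way to check a suspicion like this is to read the free heap before and after a matched open/close pair, repeated several times, and see whether the heap returns to its starting value. The following is a minimal host-side sketch of that measurement pattern in plain C++; `leakPerIteration`, `readFreeHeap` and `action` are hypothetical names. On a device, `readFreeHeap` would presumably wrap `ESP.getFreeHeap()` and `action` would wrap the open/close pair under test.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical helper: average number of bytes that did not come back per
// iteration of `action`. A result near zero suggests the action releases
// what it allocates; a steadily positive result suggests a leak.
int64_t leakPerIteration(const std::function<uint32_t()>& readFreeHeap,
                         const std::function<void()>& action,
                         int iterations) {
    assert(iterations > 0);
    const int64_t before = readFreeHeap();
    for (int i = 0; i < iterations; ++i) {
        action();
    }
    const int64_t after = readFreeHeap();
    return (before - after) / iterations;
}
```

Running several iterations rather than one helps separate a one-time allocation on first use from a loss that repeats on every call. A reading taken immediately afterwards may still overstate the loss if resources are only released after a delay.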

TD-er

TD-er commented on Feb 3, 2020

@TD-er
Contributor

I have something similar here.
It looks like the leak is related to making a WiFiClient connection.
I am running a test here in which my node cannot connect to an MQTT broker, so every N seconds (30, I believe) it re-attempts a connection, and after roughly 18 minutes it is out of memory.
Not sure if it is related to unsuccessful reconnects or not.

I've also tried to recreate the WiFiClient object again at each reconnect attempt, but that doesn't seem to make any difference.

mqtt = WiFiClient(); // workaround see: https://github.com/esp8266/Arduino/issues/4497#issuecomment-373023864 
civilman2006

civilman2006 commented on Feb 4, 2020

@civilman2006
Author

I have something similar here.
It looks like the leak is related to making a WiFiClient connection.
I am running a test here in which my node cannot connect to a MQTT broker, so every N seconds (30 I believe) it will re-attempt a connection and after roughly 18 minutes it is out of memory.
Not sure if it is related to unsuccessful reconnects or not.

I've also tried to recreate the WiFiClient object again at each reconnect attempt, but that doesn't seem to make any difference.

mqtt = WiFiClient(); // workaround see: https://github.com/esp8266/Arduino/issues/4497#issuecomment-373023864 

It seems to be a different case, because I create only one connection in setup and then send only one MQTT message per loop cycle, and even without any MQTT message the memory leak is still present. So this is about something inside the lwIP lib, because switching to the older version 1.4 resolves the situation!

In your case, I think it's about creating new connections: after a failed attempt to establish one, the connection stays in the TIME_WAIT state, which destroys it only after a 2-minute delay.

TD-er

TD-er commented on Feb 4, 2020

@TD-er
Contributor

@civilman2006 I agree it does look like a different set of symptoms.

Last night I also tested with the latest Git head of esp8266/Arduino, and that does at least seem to solve this reboot issue due to memory exhaustion.
It looks like this commit may have fixed that. So maybe you can test also with the latest merges, just to be sure?

The issue I was talking about needed roughly 18 minutes to run out of memory, so I am not sure those idle connection attempts were destroyed after 2 minutes.
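
The reasoning here comes down to simple arithmetic: if each failed attempt only holds memory for a fixed linger time, the number of attempts holding memory at once is bounded, so free heap should level off rather than run out. Below is a minimal sketch of that bound in plain C++, assuming one attempt every 30 seconds and a 2-minute linger as mentioned in the comments; `maxLingering` is a hypothetical helper name and the model ignores everything else the node allocates.

```cpp
#include <cassert>

// Hypothetical helper: upper bound on how many attempts can still be
// lingering at the same moment, when a new attempt starts every
// `attemptIntervalSeconds` and each one is released `lingerSeconds` later.
int maxLingering(int lingerSeconds, int attemptIntervalSeconds) {
    assert(lingerSeconds > 0 && attemptIntervalSeconds > 0);
    // Integer ceiling of lingerSeconds / attemptIntervalSeconds.
    return (lingerSeconds + attemptIntervalSeconds - 1) / attemptIntervalSeconds;
}
```

With a 120-second linger and a 30-second interval the bound is 4 attempts. Under this model heap use would plateau after the first two minutes, so a node that keeps losing heap for 18 minutes points to something other than the linger delay alone.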

d-a-v

d-a-v commented on Feb 4, 2020

@d-a-v
Collaborator

I just tried the OP's sketch and I do not get the same output.

Current git master, no debug:

09:37:51.869 -> Connecting to xxxx
09:37:52.565 -> .1.2.3.4.5.6.7.8.9.10.11.12.13.14.15.16.17.18.19.20.21.22.23.24.25
09:37:55.879 -> WiFi connected - ESP IP address: 10.0.1.225
09:37:57.370 -> Attempting MQTT connection...connected
09:38:21.797 -> RAM: 49352  change 1368
09:38:51.825 -> RAM: 49352  change 1368
09:39:21.873 -> RAM: 49352  change 1368
09:39:51.856 -> RAM: 49352  change 1368
09:40:21.873 -> RAM: 49352  change 1368
09:40:51.886 -> RAM: 49352  change 1368
09:41:21.874 -> RAM: 49352  change 1368
09:41:51.889 -> RAM: 49352  change 1368
09:42:21.915 -> RAM: 49352  change 1368
09:42:51.932 -> RAM: 49352  change 1368
09:43:21.948 -> RAM: 49352  change 1368
09:43:51.939 -> RAM: 49352  change 1368
09:44:21.956 -> RAM: 49352  change 1368
09:44:51.974 -> RAM: 49352  change 1368
09:45:21.963 -> RAM: 49352  change 1368
09:45:52.014 -> RAM: 49352  change 1368
09:46:21.999 -> RAM: 49352  change 1368
09:46:52.018 -> RAM: 49352  change 1368
09:47:22.035 -> RAM: 49352  change 1368
09:47:52.050 -> RAM: 49352  change 1368
09:48:22.067 -> RAM: 49352  change 1368

2.6.3, debug enabled

09:54:38.670 -> SDK:2.2.2-dev(38a443e)/Core:2.6.3=20603000/lwIP:IPv6+STABLE-2_1_2_RELEASE/glue:1.2-31-gf839746/BearSSL:89454af
09:54:38.670 -> wifi evt: 2
09:54:38.670 -> 
09:54:38.704 -> Connecting to yyyyy
09:54:38.704 -> scandone
09:54:38.803 -> scandone
09:54:38.803 -> state: 0 -> 2 (b0)
09:54:38.836 -> state: 2 -> 3 (0)
09:54:38.836 -> state: 3 -> 5 (10)
09:54:38.836 -> add 0
09:54:38.836 -> aid 1
09:54:38.836 -> cnt 
09:54:38.869 -> 
09:54:38.869 -> connected with yyyyy, channel 1
09:54:38.869 -> dhcp client start...
09:54:38.869 -> wifi evt: 0
09:54:38.869 -> ip:10.0.1.225,mask:255.255.255.0,gw:10.0.1.254
09:54:38.902 -> wifi evt: 3
09:54:39.201 -> 
09:54:39.201 -> WiFi connected - ESP IP address: 10.0.1.225
09:54:39.765 -> ip:10.0.1.225,mask:255.255.255.0,gw:10.0.1.254
09:54:40.660 -> ip:10.0.1.225,mask:255.255.255.0,gw:10.0.1.254
09:54:40.693 -> Attempting MQTT connection...[hostByName] Host: 10.0.1.253 is a IP!
09:54:40.693 -> :ref 1
09:54:40.693 -> :wr 49 0
09:54:40.693 -> :wrc 49 49 0
09:54:40.742 -> :ack 49
09:54:40.742 -> :rn 4
09:54:40.742 -> :c0 1, 4
09:54:40.742 -> connected
09:54:40.742 -> :wr 18 0
09:54:40.742 -> :wrc 18 18 0
09:54:40.742 -> :ack 18
09:54:41.654 -> ip:10.0.1.225,mask:255.255.255.0,gw:10.0.1.254
09:54:48.845 -> pm open,type:2 0
09:55:02.699 -> :rcl
09:55:02.699 -> :abort
09:55:08.632 -> RAM: 48088  change 1392
09:55:38.626 -> RAM: 48088  change 1392
09:56:08.652 -> RAM: 48088  change 1392
09:56:38.662 -> RAM: 48088  change 1392
09:57:08.654 -> RAM: 48088  change 1392
09:57:38.674 -> RAM: 48088  change 1392
09:58:08.699 -> RAM: 48088  change 1392
d-a-v

d-a-v commented on Feb 4, 2020

@d-a-v
Collaborator

@civilman2006 The lwip2 version you use is "glue:1.2-16", which is roughly the one shipped with 2.6.3, while I used "glue:1.2-31". That may explain the difference.

Unfortunately, #6887 is not merged yet.
Are you able to test it?
If not, I can try to generate an alpha release so you can try it with the Arduino board installer.

d-a-v

d-a-v commented on Feb 4, 2020

@d-a-v
Collaborator

Because I don't see what would have changed regarding a possible memory leak, I tried with the same lwip2 version, and unfortunately I see no leak.

10:27:35.598 -> SDK:2.2.2-dev(38a443e)/Core:2.6.3=20603000/lwIP:STABLE-2_1_2_RELEASE/glue:1.2-17-g354887a/BearSSL:89454af
10:27:35.598 -> wifi evt: 2
10:27:35.598 -> 
10:27:35.598 -> Connecting to wwwwww
10:27:35.664 -> scandone
10:27:36.294 -> .1.2.3.4.5.6.7.8.9.10.11.12.13.14.15.16.17.18.19.20.21.22.23scandone
10:27:39.507 -> state: 0 -> 2 (b0)
10:27:39.507 -> .24state: 2 -> 3 (0)
10:27:39.507 -> state: 3 -> 5 (10)
10:27:39.507 -> add 0
10:27:39.507 -> aid 1
10:27:39.507 -> cnt 
10:27:39.540 -> 
10:27:39.540 -> connected with wwwwwww, channel 1
10:27:39.540 -> dhcp client start...
10:27:39.540 -> wifi evt: 0
10:27:39.573 -> ip:10.0.1.225,mask:255.255.255.0,gw:10.0.1.254
10:27:39.573 -> wifi evt: 3
10:27:39.606 -> .25
10:27:39.606 -> WiFi connected - ESP IP address: 10.0.1.225
10:27:41.097 -> Attempting MQTT connection...[hostByName] Host: 10.0.1.253 is a IP!
10:27:41.097 -> :ref 1
10:27:41.097 -> :wr 49 0
10:27:41.097 -> :wrc 49 49 0
10:27:41.130 -> :ack 49
10:27:41.130 -> :rn 4
10:27:41.130 -> :c0 1, 4
10:27:41.130 -> connected
10:27:41.130 -> :wr 18 0
10:27:41.130 -> :wrc 18 18 0
10:27:41.130 -> :ack 18
10:27:49.511 -> pm open,type:2 0
10:28:02.697 -> :rcl
10:28:02.697 -> :abort
10:28:05.547 -> RAM: 50088  change 408
10:28:35.540 -> RAM: 50088  change 408
10:29:05.557 -> RAM: 50088  change 408
10:29:35.583 -> RAM: 50088  change 408
10:30:05.575 -> RAM: 50088  change 408
10:30:35.592 -> RAM: 50088  change 408
10:31:05.581 -> RAM: 50088  change 408
10:31:35.601 -> RAM: 50088  change 408
civilman2006

civilman2006 commented on Feb 4, 2020

@civilman2006
Author

@civilman2006 lwip2 version you use is "glue:1.2-16" which is about the one shipped with 2.6.3 while I have used "glue:1.2-31". That may explain that.

Unfortunately, #6887 is not merged yet.
Are you able to test it ?
If not I can try to generate an alpha release so you can try with the arduino board installer.

Thx for the reply!

I spent about 2 hours trying to find out how to update glue from 1.2-16 to 1.2-31+, but I can't find the right way to do it on Windows, because I don't have 'make' and so on. I updated the board from git, and that works fine, but I still have the same glue version, 1.2-16, and get the same memory leak.

If possible, could you tell me how to update glue?
My current version of the SDK and so on:
SDK:2.2.2-dev(38a443e)/Core:2.6.3-44-g6be56161=20603044/lwIP:STABLE-2_1_2_RELEASE/glue:1.2-16-ge23a07e/BearSSL:0645c68 (from the esp8266 debug message)

And in your next message you wrote that there is no problem with the old glue, but the version is different: glue:1.2-17, while mine is 1.2-16...

d-a-v

d-a-v commented on Feb 4, 2020

@d-a-v
Collaborator

1.2-16 and 1.2-17 make no difference.
lwip2 is updated by #6887, which is yet to be merged. You can however try that pull request with, for example, this gist.
But as I said, it may not be the issue. More testers who are able to reproduce your issue are needed.

civilman2006

civilman2006 commented on Feb 5, 2020

@civilman2006
Author

If not I can try to generate an alpha release so you can try with the arduino board installer.

Thx for the reply! I can't find a way to update with #6887, because I use the Arduino IDE on Windows and have no compiler or make tool. So if you can point me to some instructions, I can try to test this upcoming merge. Or I can try an alpha release.

Juppit

Juppit commented on Feb 10, 2020

@Juppit
Contributor

I just updated a script at https://gist.github.com/Juppit/5e1e61eceb4c9a63136ff4d5411b5ff1, which creates a Cygwin installation on Windows, downloads the esp8266 repository, and builds a board support package for the Arduino IDE. An HTTP server will then serve the package to the IDE when you use 'http://localhost:8000/versions/package_esp8266com_index.json' in the IDE settings.

On 05.02.2020 at 06:35, Dmitriy Khizhinskiy wrote:

If not I can try to generate an alpha release so you can try with the arduino board installer. Thx for the reply! I can't find way to update with #6887 because windows Arduino IDE & no compiler & make tool... So if you can help me with providing a link to some instruction I can try to test this future merge.. Or I can try alpha release...

civilman2006

civilman2006 commented on Feb 16, 2020

@civilman2006
Author

Thx for the script!
I installed it and successfully built a local version, BUT I have the same memory issue. Is it because the glue version was not updated?
SDK:2.2.2-dev(38a443e)/Core:2.6.3-56-g5efdc776=20603056/lwIP:STABLE-2_1_2_RELEASE/glue:1.2-16-ge23a07e/BearSSL:0645c68

I think I installed the boards correctly: "C:\Users\Дмитрий\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.7.0-dev-nightly+20200216"
Only this build is installed. I selected Generic 8266 -> 4M (3SPIFF). I don't know why glue is still the same version.

laercionit

laercionit commented on Mar 15, 2020

@laercionit
Contributor

Hello, I'm having this problem using core 2.6.3, and I still haven't found a solution.
Has anyone managed to solve it?

27 remaining items

philbowles

philbowles commented on Apr 17, 2020

@philbowles

Latest: the user has throttled his router, and the ESP is now receiving at most 100k/s across all traffic... and the heap is stable at 95% of its start value. So it's back to looking like a rate issue, with the ESP / core / lwIP not able to free packets fast enough AND that triggering a memleak.

I don't know what else I can do, but I'm happy to try sensible and polite suggestions.

TD-er

TD-er commented on Apr 17, 2020

@TD-er
Contributor

The idea I had when suggesting it was something like this.

Just assume the unprocessed packets remain in memory for N seconds until they are cleared.
The amount of heap allocated can then remain constant as long as the node has enough time to process all stored packets, i.e. as long as packets come in more slowly than it can process them.
If you exceed that threshold once, you will see it use a bit more memory, but it should catch up later on.

This all goes well as long as the time needed to process it remains constant and the rate of messages fluctuates so that it gets below the limit the node can handle every now and then.
But the time needed to process is not constant if the memory gets more fragmented.
New allocations do take more time on fragmented memory.
So after each 'burst' of messages, it may use a bit more memory and the heap does get more fragmented and thus increases the processing time.

Given this theory, you would see an increase in the speed at which the free heap declines, at least as long as the average rate of packets remains constant and right around the initial threshold of what the node can handle. The rate must also fluctuate to see this happening.
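
The theory can be illustrated with a toy simulation. This is only a sketch in plain C++ under made-up assumptions, not a model of lwIP or the ESP8266 allocator: `simulateBacklog`, the tick-based capacity and the `penaltyEvery` parameter are all hypothetical, and the capacity penalty merely stands in for allocations getting slower on a fragmented heap.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy model, all numbers hypothetical. Each tick `arrivals[t]` packets come
// in and the node handles up to `capacity` of them; the rest stay queued.
// If `penaltyEvery` > 0, capacity drops by one after every `penaltyEvery`
// ticks that end with a backlog. Pass 0 to keep the capacity constant.
std::vector<int> simulateBacklog(const std::vector<int>& arrivals,
                                 int capacity, int penaltyEvery) {
    assert(capacity > 0 && penaltyEvery >= 0);
    std::vector<int> backlog;
    int queued = 0;
    int backlogTicks = 0;
    for (int arrived : arrivals) {
        queued = std::max(0, queued + arrived - capacity);
        if (queued > 0) {
            ++backlogTicks;
            if (penaltyEvery > 0 && backlogTicks % penaltyEvery == 0 &&
                capacity > 1) {
                --capacity;  // processing got slower after repeated bursts
            }
        }
        backlog.push_back(queued);
    }
    return backlog;
}
```

With arrivals alternating between 8 and 12 packets per tick and a capacity of 10, the constant-capacity run never queues more than 2 packets. The run with `penaltyEvery = 2` keeps accumulating backlog at an increasing pace, which is the accelerating decline the theory predicts.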

TD-er

TD-er commented on Apr 17, 2020

@TD-er
Contributor

A simple test could be to run the ESP at 160 MHz. If it can keep up longer with the other conditions the same, then my theory is a bit more plausible.

philbowles

philbowles commented on Apr 17, 2020

@philbowles

@TD-er that's our plan for tomorrow. Run 1: unthrottle the router and re-run @ 160 MHz. Run 2: throttle the router to 100kb/s, then re-run.

d-a-v

d-a-v commented on Apr 17, 2020

@d-a-v
Collaborator

@philbowles #6895 was intended to solve a similar UDP issue (just read #6831).
Are you using 2.6.3, or have you and your tester tried the latest master?

philbowles

philbowles commented on Apr 17, 2020

@philbowles

@d-a-v Sorry, still using 2.6.3, trying to nail the beast. If it's been fixed we are wasting time, so I will try to get him to test the latest master tomorrow, thanks! (and greetings from L'Orne 61330) :)

devyte

devyte commented on Apr 18, 2020

@devyte
Collaborator

The stack dump above says core 2.5.2, not 2.6.3.

devyte

devyte commented on Apr 18, 2020

@devyte
Collaborator

@d-a-v in case you haven't noticed it, @philbowles is using the AsyncUDP lib, not ours.

philbowles

philbowles commented on Apr 18, 2020

@philbowles

Good news and bad news. "My man" has rerun some tests this morning and... a debugger's worst nightmare: it's gone away. Nothing he has tried (2.5.2 reversion, 2.6.3, etc.) will now cause the heap loss.

Even more surprising, his SkyQ box is still flooding the network, and my logger shows it is actually peaking at 55 broadcasts/second while the heap is stable as a rock, bouncing up and down between 80% and 95% as the rate fluctuates, but basically "flatlining".

Tail of this a.m.'s log after 45 minutes of uptime:


09:42:13.982 -> From 192.168.1.110:35442 1554678487 38064 93% rate=6/sec (max 55)
09:42:13.982 -> ******************* Heap GAIN 656 bytes ******************
09:42:13.982 -> From 192.168.1.110:34212 1554695180 38720 95% rate=7/sec (max 55)
09:42:14.015 -> ******************* Heap LOSS 1840 bytes ******************
09:42:14.015 -> From 192.168.1.110:42525 1554712194 36880 91% rate=8/sec (max 55)
09:42:14.015 -> ******************* Heap GAIN 640 bytes ******************
09:42:14.015 -> From 192.168.1.110:34916 1554728689 37520 92% rate=9/sec (max 55)
09:42:14.049 -> ******************* Heap GAIN 592 bytes ******************
09:42:14.049 -> From 192.168.1.110:34916 1554745372 38112 94% rate=10/sec (max 55)
09:42:14.049 -> ******************* Heap GAIN 600 bytes ******************
09:42:14.049 -> From 192.168.1.110:34007 1554762205 38712 95% rate=11/sec (max 55)
09:42:14.083 -> ******************* Heap GAIN 632 bytes ******************
09:42:14.083 -> From 192.168.1.110:34916 1554779013 39344 97% rate=12/sec (max 55)
09:42:14.797 -> ******************* Heap GAIN 320 bytes ******************
09:42:14.797 -> From 192.168.1.101:63931 1555517278 39664 97% rate=1/sec (max 55)

His bewildered suggestion is that his boxes received a firmware update overnight. At a total loss for an explanation, I tend to agree with him, but only because I can think of few other realistic explanations. :(

The only positive from this is that rate does not now seem to be the core issue. We think "bad packet from the Sky box (now fixed)" is/was the answer.

I wish I could tell you something different, but now neither of us can reproduce the problem.

I am still happy to try to help if I can, of course.

TD-er

TD-er commented on Apr 18, 2020

@TD-er
Contributor

Maybe you also switched WiFi channels on the ESP, to one with less disturbances (less retransmits)?

FinduschkaLi

FinduschkaLi commented on Apr 18, 2020

@FinduschkaLi

Hi, I am following this thread closely, since I have had a problem with my ESP8266 resetting for 6 months.
I run a websocket client and observe resets several times a day (in particular with one Wi-Fi hotspot consisting of a Wi-Fi repeater; I could not yet reproduce the same behaviour on my mobile phone hotspot).

I basically lose heap on every attempt to reconnect to Wi-Fi when my router is switched off and can't be reached...

I tried a bunch of different things, no solution yet. I am currently running tests with the beta 0.0.2 as indicated above, but the behaviour stays the same. I will share the next stack prints. I am currently trying to make a minimal version to be able to reproduce the behaviour.

Library Version 2.6.3 + Beta 0.0.2
Lwip v2 - lower Memory
CPU 80Mhz

Let me know if I can be of any help to this.

devyte

devyte commented on Apr 18, 2020

@devyte
Collaborator

@FinduschkaLi that sounds like a completely different problem. Please don't hijack this thread, which is specific to a reported mem leak on each loop.

devyte

devyte commented on Apr 18, 2020

@devyte
Collaborator

@philbowles it sounds to me like your friend had a corrupted build. I've seen that reported, and a clean build from scratch would make the problem go away.

devyte

devyte commented on Apr 18, 2020

@devyte
Collaborator

@civilman2006 Your original MCVE uses PubSubClient for MQTT. I've seen mem leaks reported when using that third-party lib, and failures to reproduce them without it, using just our core. I suggest working with the authors of PubSubClient to reach an MCVE that uses only our core.
I'm closing this. If any of the involved parties can produce an MCVE that shows the mem leak and doesn't use third-party libs, please open a new issue and follow the instructions in the issue template.


      Participants

      @earlephilhower@laercionit@TD-er@d-a-v@Juppit
